Merged
@@ -0,0 +1,234 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "a1b2c3d4",
"metadata": {},
"source": [
"![QuantConnect Logo](https://cdn.quantconnect.com/web/i/icon.png)\n",
"<hr>"
]
},
{
"cell_type": "markdown",
"id": "e5f6a7b8",
"metadata": {},
"source": [
"## Brain ML Stock Ranking Research\n",
"\n",
"This notebook studies whether Brain ML stock rankings help explain future returns."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c9d0e1f2",
"metadata": {},
"outputs": [],
"source": [
"qb = QuantBook()\n",
"# Daily bars will have an end_time that matches the following midnight.\n",
"qb.settings.daily_precise_end_time = False"
]
},
{
"cell_type": "markdown",
"id": "a3b4c5d6",
"metadata": {},
"source": [
"### Build an ML Ranking Universe\n",
"\n",
"Select assets whose Brain ML rankings are above 0.05 across the 2-, 3-, and 5-day horizons, then inspect the returned universe history."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e7f8a9b0",
"metadata": {},
"outputs": [],
"source": [
"def select_assets(data: List[BrainStockRankingUniverse]) -> List[Symbol]:\n",
"    # Keep stocks whose 2-day, 3-day, and 5-day rankings are all above 0.05.\n",
"    return [d.symbol for d in data\n",
"            if d.rank_2_days and d.rank_2_days > 0.05 and\n",
"            d.rank_3_days and d.rank_3_days > 0.05 and\n",
"            d.rank_5_days and d.rank_5_days > 0.05]\n",
"\n",
"# Add the Brain ML Stock Ranking universe.\n",
"universe = qb.add_universe(BrainStockRankingUniverse, select_assets)\n",
"# Request the last two weeks of universe history, ending one day before the current time.\n",
"universe_history = qb.universe_history(universe, qb.time - timedelta(14), qb.time - timedelta(1), flatten=True)\n",
"# Print the returned shape and columns.\n",
"print(f\"Shape: {universe_history.shape}\")\n",
"print(f\"Columns: {list(universe_history.columns)}\")\n",
"universe_history.head()"
]
},
{
"cell_type": "markdown",
"id": "c1d2e3f4",
"metadata": {},
"source": [
"### Universe Diagnostics\n",
"\n",
"Check how many assets pass the filter each day and summarize the factors."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a5b6c7d8",
"metadata": {},
"outputs": [],
"source": [
"factors = ['rank2days', 'rank3days', 'rank5days']\n",
"# Count selected assets by day.\n",
"universe_size = universe_history.groupby(level='time').size()\n",
"print(f\"Universe days: {len(universe_size)}\")\n",
"# Store the selected symbol list.\n",
"unique_assets = list(universe_history.index.levels[1].unique())\n",
"print(f\"Mean basket size per day: {universe_size.mean():.1f}\")\n",
"for factor in factors:\n",
"    print('')\n",
"    print(universe_history[factor].describe())\n",
"universe_size.plot(title='Daily Universe Size', ylabel='Universe Size');"
]
},
{
"cell_type": "markdown",
"id": "e9f0a1b2",
"metadata": {},
"source": [
"### Daily Universe Prices\n",
"\n",
"Fetch daily price history for every symbol that appears in the universe."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c3d4e5f6",
"metadata": {},
"outputs": [],
"source": [
"# Get all symbols that appeared in the universe.\n",
"symbols = list(universe_history.index.levels[1].unique())\n",
"# Request daily price history with one extra day for start-time alignment.\n",
"history = qb.history(symbols, universe_history.index[0][0] - timedelta(1), qb.time, Resolution.DAILY)\n",
"history"
]
},
{
"cell_type": "markdown",
"id": "a7b8c9d0",
"metadata": {},
"source": [
"### Align Rankings and Returns\n",
"\n",
"Join the factor values with each asset's future open-to-open return in a single table."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e1f2a3b4",
"metadata": {},
"outputs": [],
"source": [
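"# Compute open-to-open returns, then shift them back two rows so each row holds\n",
"# the return from the next bar's open to the following bar's open.\n",
"future_return = history.open.unstack(0).pct_change().shift(-2).stack().rename('futurereturn')\n",
"# Match the index names of the universe history so the join aligns on time and symbol.\n",
"future_return = future_return.rename_axis(universe_history.index.names)\n",
"# Combine the factor values with future returns in a DataFrame.\n",
"dataset = universe_history[factors].join(future_return).dropna()\n",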
"dataset.head()"
]
},
{
"cell_type": "markdown",
"id": "c5d6e7f8",
"metadata": {},
"source": [
"### Analyze Relationships Between Factors and Future Returns\n",
"\n",
"Create a scatterplot and plot the line of best fit for each factor."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a9b0c1d2",
"metadata": {},
"outputs": [],
"source": [
"y = dataset.futurereturn\n",
"for factor in factors:\n",
"    # Assign the factor values and future returns.\n",
"    x = dataset[factor]\n",
"    # Fit a simple linear model.\n",
"    slope, intercept = np.polyfit(x, y, 1)\n",
"    r_squared = x.corr(y) ** 2\n",
"    # Print the linear model statistics.\n",
"    print(f\"Factor: {factor}\")\n",
"    print(f\"Observations: {len(dataset)}\")\n",
"    print(f\"Mean future return: {y.mean():.2%}\")\n",
"    print(f\"Alpha: {intercept:.2%}\")\n",
"    print(f\"Beta: {slope:.2%}\")\n",
"    print(f\"R-squared: {r_squared:.2%}\")\n",
"    # Plot the factor values against future returns.\n",
"    plt.scatter(x, y, alpha=0.6)\n",
"    plt.plot(x.sort_values(), intercept + slope * x.sort_values(), color='tab:red', label='Linear fit')\n",
"    plt.axhline(0, color='black', linewidth=1, alpha=0.4)\n",
"    plt.title(f'{factor} vs Future Return')\n",
"    plt.xlabel(factor)\n",
"    plt.ylabel('Future Return')\n",
"    plt.legend()\n",
"    plt.show()"
]
},
{
"cell_type": "markdown",
"id": "e3f4a5b6",
"metadata": {},
"source": [
"Split each factor into up to five quantile buckets, summarize the future returns in each bucket, and draw a box plot of future returns per bucket."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c7d8e9f0",
"metadata": {},
"outputs": [],
"source": [
"# Define the future returns here so this cell doesn't rely on the previous cell.\n",
"y = dataset.futurereturn\n",
"for factor in factors:\n",
"    # Split factor values into quantile buckets.\n",
"    x = dataset[factor]\n",
"    bucket_count = min(5, x.nunique())\n",
"    buckets = pd.qcut(x, q=bucket_count, duplicates='drop')\n",
"    # Summarize each bucket with distribution statistics.\n",
"    summary = dataset.assign(bucket=buckets).groupby('bucket', observed=True).agg(\n",
"        mean_factor=(factor, 'mean'),\n",
"        min_future_return=('futurereturn', 'min'),\n",
"        max_future_return=('futurereturn', 'max'),\n",
"        mean_future_return=('futurereturn', 'mean'),\n",
"        std_future_return=('futurereturn', 'std'),\n",
"        observations=('futurereturn', 'size')\n",
"    ).reset_index()\n",
"    summary['bucket'] = summary['bucket'].astype(str)\n",
"    # Display the bucket summary.\n",
"    print(f\"Factor: {factor}\")\n",
"    display(summary.style.format({\n",
"        'mean_factor': '{:.3f}',\n",
"        'min_future_return': '{:.2%}',\n",
"        'max_future_return': '{:.2%}',\n",
"        'mean_future_return': '{:.2%}',\n",
"        'std_future_return': '{:.2%}'\n",
"    }))\n",
"    # Plot the return distribution for each bucket.\n",
"    groups = [y[buckets == b].values for b in buckets.cat.categories]\n",
"    plt.boxplot(groups, labels=[str(b) for b in buckets.cat.categories])\n",
"    plt.axhline(0, color='black', linewidth=1, alpha=0.4)\n",
"    plt.title(f'Future Return by {factor} Bucket')\n",
"    plt.xlabel(f'{factor} Bucket')\n",
"    plt.ylabel('Future Return')\n",
"    plt.xticks(rotation=45)\n",
"    plt.show()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.8.0"
}
},
"nbformat": 4,
"nbformat_minor": 5
}