[Repost] Python Example: Creating the Ultimate Stock Investing Portfolio with Machine Learning

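This article requires no prior knowledge of machine learning. Every step is explained in detail, so anyone can follow along.

The article takes the "stock recommendations" published by Yahoo Finance and then runs sentiment analysis on Twitter and news-feed data as one way of deciding whether a stock should be bought. (The article on TESLA can be accessed from the link below.)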
Elon’s Golden Gift: Predicting The Stock Price Of Tesla With Twitter And Machine Learning【Go】

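Project Blueprint

Before drawing up a blueprint, we need a concise goal. The goal here is clear: build a machine learning model that evaluates the recommendations Yahoo Finance analysts give for a company against public opinion about that company on Twitter and in financial news. After running the machine learning sentiment analysis (ML SA), the model tells the user whether he or she should buy the stock, and an effective portfolio is built this way.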

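The steps to carry out are:

· Fetch the names of the stocks recommended by Yahoo Finance.

· Fetch news articles related to those companies.

· Fetch tweets related to those companies.

· Combine the tweets and the articles into two separate strings.

· Run sentiment analysis (SA) on the tweets and the news articles.

· Based on the resulting score, decide whether the recommendation should be followed.

· Send the recommendation to the user through Facebook Messenger.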

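Understanding the Data

The data used in this project are news articles, tweets, and the stocks recommended by Yahoo.

The recommendation system is built on a rating scale (pictured in the original post). A stock's rating ranges from 1 (Strong Buy) to 5 (Strong Sell). For the purposes of this project, the ratings of interest are those in 1–1.5, exactly 3, and 4.5–5.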
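The three ranges of interest can be captured in a small helper. The following is a minimal sketch; the function name `rating_bucket` and the label strings are illustrative assumptions and do not appear in the original post.

```python
from typing import Optional


def rating_bucket(rating: float) -> Optional[str]:
    """Map a mean analyst rating (1 = Strong Buy, 5 = Strong Sell)
    to one of the three ranges of interest, or None if it is outside them."""
    if 1 <= rating <= 1.5:
        return "buy"
    if rating == 3:
        return "hold"
    if 4.5 <= rating <= 5:
        return "sell"
    return None


# Made-up ratings for illustration only
for r in (1.2, 3.0, 3.4, 4.7):
    print(r, rating_bucket(r))
```

A rating such as 3.4 falls outside all three ranges, so the helper returns `None` and the stock would simply be left out of the analysis.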

Master Machine Learning And NLP Through SpaceX’s Dragon Launch And… Twitter?
An A-Z guide on how to use NLP for determining public opinion for SpaceX’s most recent launch【Go】

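Running the Analysis

The first step is to import all the required libraries, which the code below does. Once the libraries are loaded, the actual stock recommendations are fetched. The S&P 500 is used for testing.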

import os
from urllib.request import urlopen, Request

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import requests
from bs4 import BeautifulSoup
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from pandas_datareader import DataReader
from yahoo_fin import stock_info as si

# Jupyter/IPython magic; remove this line when running as a plain script
%matplotlib inline

Inspecting the data returned by Yahoo shows that some stocks have no rating. These are therefore assigned the value "6" (which is obviously not on the scale).

tickers = si.tickers_sp500()
recommendations = []

# The URL parts do not change per ticker, so build them once.
# The crumb value is the one given in the original post.
lhs_url = 'https://query2.finance.yahoo.com/v10/finance/quoteSummary/'
rhs_url = ('?formatted=true&crumb=swg7qs5y9UP&lang=en-US&region=US&'
           'modules=upgradeDowngradeHistory,recommendationTrend,'
           'financialData,earningsHistory,earningsTrend,industryTrend&'
           'corsDomain=finance.yahoo.com')

for ticker in tickers:
    url = lhs_url + ticker + rhs_url
    r = requests.get(url)
    try:
        # A failed request or a missing field both end up in the except branch
        result = r.json()['quoteSummary']['result'][0]
        recommendation = result['financialData']['recommendationMean']['fmt']
    except (ValueError, KeyError, IndexError, TypeError):
        recommendation = 6  # placeholder for "no rating"

    recommendations.append(recommendation)

After some adjustments to the data, I created a pandas DataFrame named "df" with two columns: "Company" and "Recommendations".
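The adjustment code itself is not shown in the post. One way the DataFrame might be assembled from the two lists is sketched below; the sample tickers and ratings are made-up placeholders standing in for the `tickers` and `recommendations` lists built in the loop above.

```python
import pandas as pd

# Placeholder data standing in for the lists built in the loop above.
# Ratings arrive as strings; 6 marks a stock with no rating.
tickers = ["AAA", "BBB", "CCC"]
recommendations = ["1.40", "3.00", 6]

# One row per stock, with the two column names used in the article
df = pd.DataFrame({
    "Company": tickers,
    "Recommendations": recommendations,
})
print(df)
```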

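The data still has a few key problems. To fix them, I convert the values in the "Recommendations" column to float64 (they are currently strings), drop the stocks that were assigned the value "6", and finally sort the DataFrame in ascending order.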

df['Recommendations'] = pd.to_numeric(df['Recommendations'])
df = df[df.Recommendations != 6]
# sort_values returns a new DataFrame, so the result must be assigned back
df = df.sort_values(by=['Recommendations'], ascending=True)

As noted earlier, only values inside specific key ranges matter. We therefore create three new DataFrames, "hold_df", "buy_df", and "sell_df", and then concatenate them into "new_df".

hold_df = df[df.Recommendations == 3]
buy_df = df[df.Recommendations <= 1.5]
sell_df = df[df.Recommendations >= 4.5]

df_list = [hold_df, buy_df, sell_df]
new_df = pd.concat(df_list)
new_df.reset_index(level=0, inplace=True)

As a result, of the 501 initial stocks, only 14 remain.

It is now time to fetch the news and Twitter feeds. Once both feeds have been fetched, sentiment analysis is run for each stock on each platform separately, and the two results are then added and divided by two. The final sentiment score is therefore computed as follows:

Final score = (Twitter sentiment score + news-feed sentiment score) / 2
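As a quick illustration of the formula, the sketch below averages two scores with equal weight. The function name `final_score` and the input values are made up for the example.

```python
def final_score(twitter_score: float, news_score: float) -> float:
    """Average the two sentiment scores with equal weight."""
    return (twitter_score + news_score) / 2


# Made-up scores for illustration only
print(final_score(0.5, 0.25))  # 0.375
```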

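The process of importing the news-feed data and running sentiment analysis on it is described in detail below. Because the process is not very demanding, the Twitter steps are not repeated here.
Readers who are interested can find the Twitter steps at this link: Elon’s Golden Gift: Predicting The Stock Price Of Tesla With Twitter And Machine Learning【Go】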

Fetching the News Data

First, create a list containing the tickers of the 14 stocks.

tickers = []

for index, row in new_df.iterrows():
    tickers.append(row.Company)

Next, finviz is used to parse the news data into a pandas DataFrame.

finwiz_url = 'https://finviz.com/quote.ashx?t='

news_tables = {}

for ticker in tickers:
    url = finwiz_url + ticker
    req = Request(url=url, headers={'user-agent': 'my-app/0.0.1'})
    response = urlopen(req)
    # Name the parser explicitly so BeautifulSoup does not have to guess
    html = BeautifulSoup(response, 'html.parser')
    news_tables[ticker] = html.find(id='news-table')

parsed_news = []

for ticker, news_table in news_tables.items():
    if news_table is None:
        continue  # no news table found for this ticker
    date = None  # carried over from the most recent row that had a date
    for row in news_table.find_all('tr'):
        if row.a is None or row.td is None:
            continue  # skip rows without a headline link
        text = row.a.get_text()
        date_scrape = row.td.text.split()

        # A row holds either "time" alone or "date time"
        if len(date_scrape) == 1:
            time = date_scrape[0]
        else:
            date = date_scrape[0]
            time = date_scrape[1]

        parsed_news.append([ticker, date, time, text])

columns = ['ticker', 'date', 'time', 'headline']

parsed_and_scored_news = pd.DataFrame(parsed_news, columns=columns)

parsed_and_scored_news['date'] = pd.to_datetime(parsed_and_scored_news.date).dt.date

parsed_and_scored_news.head()

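To verify that everything is going as planned, take a look at the head of "parsed_and_scored_news".

Everything looks as expected. The "time" column is not needed, so it is dropped. After some consideration, the "date" column is dropped as well. If you are interested in analysis for day trading, however, it is advisable to keep the previous and current analysis dates.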
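The post does not show the code that drops these columns. A minimal sketch with `DataFrame.drop` follows, using a made-up two-row frame named `news` in place of the real "parsed_and_scored_news".

```python
import pandas as pd

# Made-up rows with the same four columns as the scraped news table
news = pd.DataFrame(
    [["AAA", "2020-06-19", "09:30AM", "Example headline one"],
     ["AAA", "2020-06-19", "10:05AM", "Example headline two"]],
    columns=["ticker", "date", "time", "headline"],
)

# Keep only the ticker and the headline text
news = news.drop(columns=["time", "date"])
print(news.columns.tolist())
```

For day-trading analysis, passing only `["time"]` to `drop` would keep the date available for grouping by day.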

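# Join all headlines of a company into one string.
# A space separator keeps the last word of one headline from fusing
# with the first word of the next.
parsed_and_scored_news = (
    parsed_and_scored_news
    .groupby(['ticker'], as_index=False)
    .agg({'headline': ' '.join})
)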

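The problem with the data is that, in its current form, no model can use it. I therefore group the headlines by the company they refer to and join each company's headlines into a single string.

We now have 14 rows and two distinct columns: "ticker", which holds the company's ticker symbol, and "headline", which contains practically all of the news headlines related to that company.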

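Performing Sentiment Analysis

Now that everything is set up, the sentiment scores of the tweets and the news data can be evaluated. For the news headlines (and likewise for the tweets), the process is very simple. Using VADER, it can be done as follows: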

vader = SentimentIntensityAnalyzer()
scores = parsed_and_scored_news['headline'].apply(vader.polarity_scores).tolist()
scores_df = pd.DataFrame(scores)
parsed_and_scored_news = parsed_and_scored_news.join(scores_df, rsuffix='_right')

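The sentiment scores are now added together and divided by 2, which gives the following result:

Done! The final sentiment score of each stock can be seen in the column named "compound". In other words, that column describes the compound sentiment score of each entry. Now, depending on whether a stock is labeled "Sell", "Buy", or "Hold", the appropriate action can be decided.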
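How the label and the sentiment score are combined is left to the reader in the post. One possible rule is sketched below, assuming a recommendation is followed only when sentiment points the same way. The function name `decide` and the 0.05 cut-off are illustrative assumptions, not the author's method.

```python
def decide(label: str, compound: float, threshold: float = 0.05) -> str:
    """Return an action for one stock.

    label     -- "buy", "hold" or "sell", taken from the analyst rating
    compound  -- final sentiment score in [-1, 1]
    threshold -- assumed cut-off for calling sentiment positive or negative
    """
    if label == "buy" and compound >= threshold:
        return "buy"
    if label == "sell" and compound <= -threshold:
        return "sell"
    if label == "hold":
        return "hold"
    # Analysts and public sentiment disagree: do not follow the recommendation
    return "no action"


# Made-up inputs for illustration only
print(decide("buy", 0.42))
print(decide("sell", 0.30))
```

A stricter or looser rule only requires changing `threshold`; the value that suits a given portfolio would need to be chosen by the reader.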

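Integrating Facebook Messenger

The model is ready to give the user recommendations. Why not make that even easier? What could be easier than a daily message on your phone? To achieve this, we next create a simple Facebook bot that sends messages automatically.

This project uses the fbchat "Echo-Bot", whose code is shown below. The only change needed is to add a small "for loop" that iterates over the columns and sends the appropriate message to the user.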

from fbchat import log, Client
from fbchat.models import Message

# Subclass fbchat.Client and override required methods
class EchoBot(Client):
    def onMessage(self, author_id, message_object, thread_id, thread_type, **kwargs):
        self.markAsDelivered(thread_id, message_object.uid)
        self.markAsRead(thread_id)

        log.info("{} from {} in {}".format(message_object, thread_id, thread_type.name))

        # If you're not the author, reply
        if author_id != self.uid:
            # "message" is a placeholder for the recommendation text
            self.send(Message(text="message"), thread_id=thread_id, thread_type=thread_type)

client = EchoBot("<username>", "<Password>")
client.listen()
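The "for loop" mentioned above is not shown in the post. A minimal sketch of what it could look like follows; `build_messages`, `send_all`, the row layout, and the message wording are all assumptions. The `send` argument is any callable that takes a string, so the loop can be tried with `print` before it is wired to an fbchat client.

```python
from typing import Callable, Iterable, List, Tuple


def build_messages(rows: Iterable[Tuple[str, str, float]]) -> List[str]:
    """Turn (ticker, action, score) rows into one text message each."""
    return [
        f"{ticker}: {action.upper()} (sentiment score {score:+.2f})"
        for ticker, action, score in rows
    ]


def send_all(messages: Iterable[str], send: Callable[[str], None]) -> None:
    """Send every message through the supplied callable."""
    for message in messages:
        send(message)


# Made-up rows for illustration only
rows = [("AAA", "buy", 0.42), ("BBB", "hold", -0.10)]
send_all(build_messages(rows), print)
```

Keeping message construction separate from sending makes the text easy to check without logging in to Messenger.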

An in-depth explanation of how to create such a bot can be found at the following link:
Making A Revolutionary Travel Companion With Machine Learning And Python【Go】

[Reference] Create The Ultimate Stock Investing Portfolio Using Machine Learning

Jason

The Dance of Disorder (Fluctuations of Entropy)
