lebedov/google_finance_intraday.py

ninovasc · 2018-05-31T07:01:00Z

Here are some adjustments to the URL, the column order, and a "main" call:

#!/usr/bin/env python 
"""
Retrieve intraday stock data from Google Finance.
"""
import sys
import csv
import datetime
import re

import pandas as pd
import requests

def get_google_finance_intraday(ticker, exchange, period=60, days=1):
    """
    Retrieve intraday stock data from Google Finance.
    Parameters
    ----------
    ticker : str
        Company ticker symbol
    exchange : str
        Exchange of ticker
    period : int
        Interval between stock values in seconds.
    days : int
        Number of days of data to retrieve.
    Returns
    -------
    df : pandas.DataFrame
        DataFrame containing the opening price, high price, low price,
        closing price, and volume. The index contains the times associated with
        the retrieved price values.
    """

    uri = 'https://www.google.com/finance/getprices' \
          '?i={period}&p={days}d&f=d,o,h,l,c,v&q={ticker}&x={exchange}'.format(ticker=ticker,
                                                                          period=period,
                                                                          days=days,
                                                                          exchange=exchange)
    page = requests.get(uri)
    # Decode to text so csv.reader also works on Python 3.
    reader = csv.reader(page.text.splitlines())
    columns = ['Close', 'High', 'Low', 'Open', 'Volume']
    rows = []
    times = []
    for row in reader:
        # Data rows start with 'a<unix time>' or an integer offset; skip headers.
        if row and re.match(r'^[a\d]', row[0]):
            if row[0].startswith('a'):
                start = datetime.datetime.fromtimestamp(int(row[0][1:]))
                times.append(start)
            else:
                times.append(start+datetime.timedelta(seconds=period*int(row[0])))
            rows.append(list(map(float, row[1:])))
    if len(rows):
        return pd.DataFrame(rows, index=pd.DatetimeIndex(times, name='Date'),
                            columns=columns)
    else:
        return pd.DataFrame(rows, index=pd.DatetimeIndex(times, name='Date'))

if __name__ == '__main__':
    # Both EXCHANGE and SYMBOL are required.
    if len(sys.argv) < 3:
        print("\nUsage: google_finance_intraday.py EXCHANGE SYMBOL\n\n")
    else:
        exchange = sys.argv[1]
        ticker = sys.argv[2]
        print("--------------------------------------------------")
        print("Processing %s" % ticker)
        print(get_google_finance_intraday(ticker, exchange, 60, 2))
        print("--------------------------------------------------")
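
Since the endpoint can no longer be queried, here is a minimal offline sketch of the row format the parser above relies on: a row starting with `a` carries a Unix timestamp, and later rows carry an integer offset counted in periods. The sample text, its `COLUMNS=` header line, and the `parse_rows` name are assumptions for illustration, and the numbers are made up. Unlike the script above, this sketch uses UTC rather than local time so the result does not depend on the machine.

```python
import datetime

# Hypothetical sample in the layout the parser expects; values are invented.
SAMPLE = """\
COLUMNS=DATE,CLOSE,HIGH,LOW,OPEN,VOLUME
a1527773400,10.0,10.5,9.5,9.8,1000
1,10.2,10.6,10.0,10.0,1200
2,10.1,10.3,9.9,10.2,800
"""


def parse_rows(text, period=60):
    """Return (timestamp, values) pairs from getprices-style text."""
    out = []
    start = None
    for line in text.splitlines():
        fields = line.split(',')
        head = fields[0]
        if head.startswith('a') and head[1:].isdigit():
            # 'a' + Unix time marks the start of a block of rows.
            start = datetime.datetime.fromtimestamp(
                int(head[1:]), datetime.timezone.utc)
            stamp = start
        elif head.isdigit() and start is not None:
            # A plain integer is an offset from the block start, in periods.
            stamp = start + datetime.timedelta(seconds=period * int(head))
        else:
            continue  # header lines such as COLUMNS=...
        out.append((stamp, [float(v) for v in fields[1:]]))
    return out


rows = parse_rows(SAMPLE)
for stamp, values in rows:
    print(stamp.isoformat(), values)
```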

simonskok · 2018-08-01T20:42:02Z

Hi guys, thanks for your great work! I wrote a similar function for fetching the intraday google data in R, but as of today the URL is no longer responding, as it directs to the normal google finance website and not to the one with the data table. Any idea whether the API is deprecated or has the URL just changed? Anyone familiar with the new URL?

Excelnabb · 2018-08-03T07:53:36Z

Hi, I have the same problem as simonskok: the URL seems to have stopped working. Has the URL changed, or has Google shut down the API?

nebaz · 2018-08-03T15:11:52Z

Yes, Google has blocked this API. Do you know of any similar online API?

christ-2 · 2018-08-06T06:57:43Z

Same experience here, damn!! I have yet to find confirmation from Google that the API is discontinued. nebaz, how do you know Google has blocked this API?

kongaraman · 2018-08-06T11:05:59Z

Same for me as well, but I found that the Yahoo link below works for 1-minute data:

https://query1.finance.yahoo.com/v7/finance/chart/RADICO.NS?&interval=1m

This needs to be converted to Python code. Can someone help to get the datetime, open, high, low, close, and volume values from it?
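
A minimal sketch of that extraction, using only the standard library and a hard-coded sample instead of a live request. The nesting (`chart` → `result[0]` → `timestamp` and `indicators.quote[0]`) is assumed from the replies further down this thread; the values and the `to_rows` name are invented for illustration.

```python
import json
from datetime import datetime, timezone

# Hypothetical payload; the shape is an assumption, the numbers are made up.
SAMPLE_JSON = """
{"chart": {"result": [{
  "timestamp": [1533787200, 1533787260],
  "indicators": {"quote": [{
    "open": [100.0, 100.5], "high": [101.0, 101.2],
    "low": [99.5, 100.1], "close": [100.5, 101.0],
    "volume": [1500, 900]}]}}]}}
"""


def to_rows(payload):
    """Return (datetime, open, high, low, close, volume) tuples."""
    body = payload['chart']['result'][0]
    quote = body['indicators']['quote'][0]
    rows = []
    for i, ts in enumerate(body['timestamp']):
        when = datetime.fromtimestamp(ts, timezone.utc)
        rows.append((when, quote['open'][i], quote['high'][i],
                     quote['low'][i], quote['close'][i], quote['volume'][i]))
    return rows


rows = to_rows(json.loads(SAMPLE_JSON))
for row in rows:
    print(row)
```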

giogio02 · 2018-08-08T11:35:32Z

Well, with the file $f you get from Yahoo, I wrote a script that uses an old version of gawk and "jq" (maybe json2csv would have been easier...).
You can modify it to change the columns and to remove "empty minutes".

cat $f | jq -r '[.[] | {result:
 [.result[].timestamp,
 .result[].indicators.quote[].close,
 .result[].indicators.quote[].high,
 .result[].indicators.quote[].low,
 .result[].indicators.quote[].open,
 .result[].indicators.quote[].volume
 | @csv] } ] '  | gawk -F"," ' BEGIN {
        nrec=1
}
NF > 2 {
        gsub(/\"/,"")
        for(i=1; i<=NF; i++)
        arr[nrec, i] = $i
        tt=NF
        nrec++
}
END {
        for (t=1;t < tt; t++)
        {
        printf "%d,", arr[1,t]
        for (v=2;v < 6; v++)
                printf "%7.3f,", arr[v,t]
        printf "%d\n", arr[6,t]
        }
}'

Then to convert the timestamp $1 in HH:MM:SS, using variables h, m, s:

s=$(date +%s)
let re=s%86400
let midn=s-re-7200   # that's for my timezone ;-) modify it using date +%z

                ds=$1-midn             
                h=ds/3600;
                hre=ds % 3600
                m=hre/60
                s=hre % 60
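
For anyone doing this step in Python instead, a small sketch of the same conversion follows. The 7200-second offset mirrors the value used above and is only an example; the function names are hypothetical. The first function repeats the manual arithmetic, the second lets `datetime` do the work.

```python
from datetime import datetime, timedelta, timezone


def hhmmss_manual(epoch_seconds, utc_offset_seconds=7200):
    """Seconds since local midnight, split into hours, minutes, seconds."""
    ds = (epoch_seconds + utc_offset_seconds) % 86400
    h, rem = divmod(ds, 3600)
    m, s = divmod(rem, 60)
    return '%02d:%02d:%02d' % (h, m, s)


def hhmmss(epoch_seconds, utc_offset_seconds=7200):
    """Same result using a fixed-offset timezone."""
    tz = timezone(timedelta(seconds=utc_offset_seconds))
    return datetime.fromtimestamp(epoch_seconds, tz).strftime('%H:%M:%S')


print(hhmmss_manual(1533722400))  # 12:00:00
print(hhmmss(1533722400))         # 12:00:00
```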

carly11 · 2018-08-09T04:00:53Z

Here is a script that pulls 60-minute data from Yahoo. Copy the contents below into a Python 2.x file and pass the symbol as the command-line argument. You can change the data interval on the line at the bottom of the code.

import sys

import arrow
import pandas as pd
import requests

symbol1 = sys.argv[1]
symbol1 = symbol1.upper()

def get_quote_data(symbol='iwm', data_range='1d', data_interval='60m'):
    res = requests.get('https://query1.finance.yahoo.com/v8/finance/chart/{symbol}?range={data_range}&interval={data_interval}'.format(**locals()))
    data = res.json()
    body = data['chart']['result'][0]
    # Convert the Unix timestamps to naive datetimes in EST for the index.
    dt = pd.Series(map(lambda x: arrow.get(x).to('EST').datetime.replace(tzinfo=None), body['timestamp']), name='dt')
    df = pd.DataFrame(body['indicators']['quote'][0], index=dt)
    return df.loc[:, ('open', 'high', 'low', 'close', 'volume')]

# Change the range ('2d') and the interval ('60m') here.
q = get_quote_data(symbol1, '2d', '60m')
print(q)
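
A small Python 3 sketch of building the same request URL with `urllib.parse`, so that stray whitespace in an argument or a special character in a symbol does not end up raw in the query string. The `chart_url` helper is hypothetical; only the endpoint and the `range` and `interval` parameters come from the script above.

```python
from urllib.parse import quote, urlencode

BASE = 'https://query1.finance.yahoo.com/v8/finance/chart/'


def chart_url(symbol, data_range='1d', data_interval='60m'):
    """Build the chart URL, trimming whitespace and escaping the symbol."""
    params = {'range': data_range.strip(), 'interval': data_interval.strip()}
    return BASE + quote(symbol.strip().upper(), safe='') + '?' + urlencode(params)


print(chart_url('iwm', '2d', ' 60m'))
```

The resulting string can be passed to `requests.get` in place of the `format(**locals())` expression.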

kongaraman · 2018-08-09T10:16:53Z

Thanks carly11, very smart code

I modified your code for Python 3.6.5 and made small changes:

import requests
import pandas as pd
import arrow
import datetime

def get_quote_data(symbol='SBIN.NS', data_range='1d', data_interval='1m'):
    res = requests.get('https://query1.finance.yahoo.com/v8/finance/chart/{symbol}?range={data_range}&interval={data_interval}'.format(**locals()))
    data = res.json()
    body = data['chart']['result'][0]
    dt = datetime.datetime
    dt = pd.Series(map(lambda x: arrow.get(x).to('EST').datetime.replace(tzinfo=None), body['timestamp']), name='Datetime')
    df = pd.DataFrame(body['indicators']['quote'][0], index=dt)
    dg = pd.DataFrame(body['timestamp'])

    return df.loc[:, ('open', 'high', 'low', 'close', 'volume')]

data = get_quote_data('SBIN.NS', '1d', '1m')
data.dropna(inplace=True) #removing NaN rows
print(data)

When I changed EST to IST (Indian Standard Time), it threw the error 'ParserError: Could not parse timezone expression "IST"'.

Do you have any idea how to make the datetime column reflect IST correctly?

kongaraman · 2018-08-09T10:55:14Z

I was able to change to Indian Standard Time by using 'Asia/Calcutta'. I found this name in https://raw.githubusercontent.com/SpiRaiL/timezone/master/timezone.py

Modified code which now works for IST is:

import requests
import pandas as pd
import arrow
import datetime

def get_quote_data(symbol='SBIN.NS', data_range='1d', data_interval='1m'):
    res = requests.get('https://query1.finance.yahoo.com/v8/finance/chart/{symbol}?range={data_range}&interval={data_interval}'.format(**locals()))
    data = res.json()
    body = data['chart']['result'][0]    
    dt = datetime.datetime
    dt = pd.Series(map(lambda x: arrow.get(x).to('Asia/Calcutta').datetime.replace(tzinfo=None), body['timestamp']), name='Datetime')
    df = pd.DataFrame(body['indicators']['quote'][0], index=dt)
    dg = pd.DataFrame(body['timestamp'])
    
    return df.loc[:, ('open', 'high', 'low', 'close', 'volume')]

data = get_quote_data('SBIN.NS', '1d', '1m')
data.dropna(inplace=True) #removing NaN rows
print(data)
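
As a side note, abbreviations such as "IST" are ambiguous (India, Ireland, and Israel all use it), which is probably why a region name like 'Asia/Calcutta' is needed. If pulling in a timezone database is not wanted, a minimal standard-library sketch with a fixed +05:30 offset gives the same naive timestamps; this assumes India's offset with no daylight saving, and `to_ist_naive` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset for Indian Standard Time (UTC+05:30, no daylight saving).
IST = timezone(timedelta(hours=5, minutes=30), name='IST')


def to_ist_naive(epoch_seconds):
    """Convert a Unix timestamp to a naive datetime expressed in IST."""
    return datetime.fromtimestamp(epoch_seconds, IST).replace(tzinfo=None)


print(to_ist_naive(1533787200))  # 2018-08-09 09:30:00
```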

kongaraman · 2018-08-09T11:54:12Z

import requests
import pandas as pd
import arrow
import datetime

def get_quote_data(symbol='SBIN.NS', data_range='1d', data_interval='1m'):
    res = requests.get('https://query1.finance.yahoo.com/v8/finance/chart/{symbol}?range={data_range}&interval={data_interval}'.format(**locals()))
    data = res.json()
    body = data['chart']['result'][0]    
    dt = datetime.datetime
    dt = pd.Series(map(lambda x: arrow.get(x).to('Asia/Calcutta').datetime.replace(tzinfo=None), body['timestamp']), name='Datetime')
    df = pd.DataFrame(body['indicators']['quote'][0], index=dt)
    dg = pd.DataFrame(body['timestamp'])    
    df = df.loc[:, ('open', 'high', 'low', 'close', 'volume')]
    df.dropna(inplace=True)     #removing NaN rows
    df.columns = ['OPEN', 'HIGH','LOW','CLOSE','VOLUME']    #Renaming columns in pandas
    
    return df

data = get_quote_data('KPIT.NS', '1d', '1m')
print(data)

carly11 · 2018-08-10T16:44:07Z

Does anyone know why yahoo does not provide 30 minute data? I know that 1, 15, 60 work fine but not 30m.

Rajasekaran1976 · 2018-08-10T17:55:50Z

how do you get 1, 15, 60 minute data from yahoo?

carly11 · 2018-08-10T19:35:36Z

using the script posted just before my question.

ynagendra · 2018-10-18T05:30:26Z

The code above by kongaraman is wonderful. thanks for posting.

kongaraman · 2018-10-24T08:18:33Z

Thanks ynagendra

kongaraman · 2018-10-24T08:22:37Z

I noticed that the Yahoo data is not constant. For the 1-minute interval, I observed that it returns different values for the same time period.

When I run the above Python code, the data for a given period should be the same no matter how many times I run it.

Does anyone have an idea about this? Also, this data does not exactly match the actual exchange data.

I am not sure why there is a difference. If anybody has an idea, please post your comments.

kongaraman · 2018-10-30T04:18:25Z

I observed that there are many empty values in the extracted data.

So I used df.dropna(inplace=True) #removing NaN rows

In fact, the above line should not be needed. We need good-quality data, which the above link does not provide.
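
Before dropping rows, it can help to see how many bars are actually empty. A minimal sketch, assuming missing minutes arrive as JSON `null` (Python `None`, which pandas then shows as NaN); the sample lists and the `split_missing` name are invented for illustration.

```python
def split_missing(timestamps, closes):
    """Separate timestamps with a close value from those without one."""
    complete, missing = [], []
    for ts, close in zip(timestamps, closes):
        (missing if close is None else complete).append(ts)
    return complete, missing


# Hypothetical 1-minute bars; None marks a minute with no data.
timestamps = [1540354500, 1540354560, 1540354620, 1540354680]
closes = [250.1, None, None, 250.4]

complete, missing = split_missing(timestamps, closes)
print('%d of %d bars missing' % (len(missing), len(timestamps)))
```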

sanjayubs · 2019-01-02T11:45:22Z

Hi,
I am getting the error "ModuleNotFoundError: No module named 'arrow'". I have installed the arrow module, but it still throws this error.

bertrandobi · 2019-03-09T12:27:28Z

Thanks for the great ideas.
I tried to apply kongaraman's function to a list of stock tickers. When I applied a for loop, I got the following error:

TypeError: 'NoneType' object has no attribute '__getitem__'
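
One plausible cause (an assumption, since the traceback is not shown) is that `result` comes back as null for one of the tickers, so `data['chart']['result'][0]` ends up indexing `None`. A minimal sketch of guarding against that inside a loop follows; the payloads, ticker names, and the `extract_body` helper are hypothetical.

```python
def extract_body(payload):
    """Return result[0] from a chart payload, or None if there is no result."""
    chart = (payload or {}).get('chart') or {}
    result = chart.get('result')
    if not result:
        return None
    return result[0]


# Hypothetical responses: one with data, one with a null result.
payloads = {
    'GOOD.NS': {'chart': {'result': [{'timestamp': [1552120200]}], 'error': None}},
    'BAD.NS': {'chart': {'result': None, 'error': {'description': 'hypothetical error'}}},
}

loaded, skipped = [], []
for symbol, payload in sorted(payloads.items()):
    body = extract_body(payload)
    if body is None:
        skipped.append(symbol)  # skip this ticker instead of crashing
        continue
    loaded.append(symbol)

print('loaded:', loaded, 'skipped:', skipped)
```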

rkolagatla · 2019-03-31T13:11:51Z

Hi all,
The Yahoo Finance API seems to be stuck at March 28th, 2019 and is not returning any data after that. I am trying to import data for the NIFTYBANK NSE index and have tried other stocks as well. Are others facing the same problem, and can anyone suggest an alternative 1-minute data provider for NSE stocks?
Thanks,
Rajesh

peter4apple · 2020-05-20T16:04:32Z

Hi.
Was anyone able to get hourly information on stock prices, either individual or by S&P?

aspiringguru · 2020-07-25T02:54:14Z

Seems to have stopped working; it now isn't returning anything. Could be me, of course.

URL is http://www.google.com/finance/getprices?i=60&p=1d&f=d,o,h,l,c,v&df=cpct&q=VOD&x=LON
Empty DataFrame
Columns: []
Index: []
done

when I run that URL in the browser, it thinks I'm a bot and shuts me down, any way around that please ?

ta

that url failed for me as well.

We're sorry...
... but your computer or network may be sending automated queries. To protect our users, we can't process your request right now.

earlcharles1 · 2020-11-24T22:42:27Z

Works great, but is there a way to save this data to a CSV file?

lebedov · 2020-11-25T03:12:00Z

Works great, but is there a way to save this data to a CSV file?

data.to_csv('output.csv')

adgestars007 · 2020-11-27T07:00:45Z

To sum up this thread:
Step 1: Install Python: https://www.python.org/downloads/
Step 2: Install pandas (the script also imports requests and arrow): pip install pandas requests arrow
Step 3: Save this code:
import requests
import pandas as pd
import arrow
import datetime

def get_quote_data(symbol='SBIN.NS', data_range='1d', data_interval='1m'):
    res = requests.get('https://query1.finance.yahoo.com/v8/finance/chart/{symbol}?range={data_range}&interval={data_interval}'.format(**locals()))
    data = res.json()
    body = data['chart']['result'][0]
    dt = datetime.datetime
    dt = pd.Series(map(lambda x: arrow.get(x).to('Asia/Calcutta').datetime.replace(tzinfo=None), body['timestamp']), name='Datetime')
    df = pd.DataFrame(body['indicators']['quote'][0], index=dt)
    dg = pd.DataFrame(body['timestamp'])

    return df.loc[:, ('open', 'high', 'low', 'close', 'volume')]

data = get_quote_data('SBIN.NS', '5d', '1m')
data.dropna(inplace=True) #removing NaN rows
print(data)
data.to_csv('output.csv')
Step 4: Drag the saved python file to CMD and an output CSV will be saved.

atulsahire · 2020-12-15T11:37:16Z

what is the symbol for NIFTY50 data

syedshahab698 · 2021-02-21T15:48:20Z

what is the symbol for NIFTY50 data

Yahoo finance symbol for nifty 50 = '^NSEI'

sairam18814 · 2021-05-17T05:03:37Z

import requests
import pandas as pd
import arrow
import datetime

def get_quote_data(symbol, data_range='1d', data_interval='1m'):
    res = requests.get('https://query1.finance.yahoo.com/v8/finance/chart/{symbol}?range={data_range}&interval={data_interval}'.format(**locals()))
    data = res.json()
    body = data['chart']['result'][0]
    dt = datetime.datetime
    dt = pd.Series(map(lambda x: arrow.get(x).to('Asia/Calcutta').datetime.replace(tzinfo=None), body['timestamp']), name='Datetime')
    df = pd.DataFrame(body['indicators']['quote'][0], index=dt)
    dg = pd.DataFrame(body['timestamp'])
    df = df.loc[:, ('open', 'high', 'low', 'close', 'volume')]
    df.dropna(inplace=True)
    df.columns = ['OPEN', 'HIGH','LOW','CLOSE','VOLUME']

    return df

# Example tickers taken from earlier in this thread; replace with your own list.
symbols = ['SBIN.NS', 'KPIT.NS']

for s in symbols:
    data = get_quote_data(s, '1d', '1m')
    # str.strip(".NS") removes any of the characters '.', 'N', 'S' from both
    # ends, so drop the suffix explicitly instead.
    data.to_csv(s.replace('.NS', '') + '.csv')

this is for multiple stock symbols
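
A note on deriving file names from tickers such as 'SBIN.NS': `str.strip(".NS")` treats its argument as a set of characters to remove from both ends, not as a suffix, so it can eat letters of the ticker itself. A short sketch of the difference, with `csv_name` as a hypothetical helper:

```python
def csv_name(symbol, suffix='.NS'):
    """Drop the exchange suffix (if present) and append '.csv'."""
    base = symbol[:-len(suffix)] if symbol.endswith(suffix) else symbol
    return base + '.csv'


print('SBIN.NS'.strip('.NS'))  # BI  (leading 'S' and trailing 'N' are lost)
print(csv_name('SBIN.NS'))     # SBIN.csv
print(csv_name('KPIT.NS'))     # KPIT.csv
```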

	#!/usr/bin/env python
	"""
	Retrieve intraday stock data from Google Finance.
	"""

	import csv
	import datetime
	import re

	import pandas as pd
	import requests

	def get_google_finance_intraday(ticker, period=60, days=1):
	    """
	    Retrieve intraday stock data from Google Finance.

	    Parameters
	    ----------
	    ticker : str
	        Company ticker symbol.
	    period : int
	        Interval between stock values in seconds.
	    days : int
	        Number of days of data to retrieve.

	    Returns
	    -------
	    df : pandas.DataFrame
	        DataFrame containing the opening price, high price, low price,
	        closing price, and volume. The index contains the times associated with
	        the retrieved price values.
	    """

	    uri = 'http://www.google.com/finance/getprices' \
	          '?i={period}&p={days}d&f=d,o,h,l,c,v&df=cpct&q={ticker}'.format(ticker=ticker,
	                                                                          period=period,
	                                                                          days=days)
	    page = requests.get(uri)
	    reader = csv.reader(page.content.splitlines())
	    columns = ['Open', 'High', 'Low', 'Close', 'Volume']
	    rows = []
	    times = []
	    for row in reader:
	        if re.match('^[a\d]', row[0]):
	            if row[0].startswith('a'):
	                start = datetime.datetime.fromtimestamp(int(row[0][1:]))
	                times.append(start)
	            else:
	                times.append(start+datetime.timedelta(seconds=period*int(row[0])))
	            rows.append(map(float, row[1:]))
	    if len(rows):
	        return pd.DataFrame(rows, index=pd.DatetimeIndex(times, name='Date'),
	                            columns=columns)
	    else:
	        return pd.DataFrame(rows, index=pd.DatetimeIndex(times, name='Date'))
