Yahoo Finance for Time Series Stock Data
When you need Bitcoin data you go to Binance; for stock price data Yahoo Finance is the place to go!
Whether you want to train a model or study the history of a stock ticker like Apple (AAPL), Microsoft (MSFT) or Amazon (AMZN), you need historical data, preferably as complete as possible.
Many major companies, such as General Motors (GM), have been around for a long time. On the other hand, there are startups and more recent firms, such as Super Micro (SMCI) or Rigetti Computing (RGTI), that have often been around for only a few years. For most stock-like assets you will want the data from the last 2 years, up to and including the most recent prices.
Time series
We are talking about so-called time series: collections of data points in which information is recorded per unit of time, such as a day, an hour or a minute. For financial assets, such as stocks or crypto, that information is price data: the opening, closing, high and low price for each unit of time, plus the traded volume and the number of transactions in that unit of time.
You can download historical stock price data from many places on the internet. One of the best places to get a decent amount of data for almost every popular asset is still the well-known Yahoo Finance portal.
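To make the idea concrete, here is a minimal sketch of how individual trades could be condensed into one record per hour. The trades and the helper name to_hourly_bars are invented for illustration; they are not part of the download code in this article.

```python
from datetime import datetime, timezone

# Made-up trades: (timestamp, price, quantity). Not real market data.
trades = [
    (datetime(2025, 1, 6, 15, 5, tzinfo=timezone.utc), 100.0, 10),
    (datetime(2025, 1, 6, 15, 20, tzinfo=timezone.utc), 101.5, 5),
    (datetime(2025, 1, 6, 15, 55, tzinfo=timezone.utc), 99.0, 20),
    (datetime(2025, 1, 6, 16, 10, tzinfo=timezone.utc), 99.5, 8),
    (datetime(2025, 1, 6, 16, 40, tzinfo=timezone.utc), 102.0, 12),
]


def to_hourly_bars(trades):
    """Summarise trades into one record per hour (hypothetical helper)."""
    bars = {}
    for ts, price, qty in sorted(trades):
        # truncate the timestamp to the start of its hour
        hour = ts.replace(minute=0, second=0, microsecond=0)
        if hour not in bars:
            # the first trade of the hour sets the opening price
            bars[hour] = {"open": price, "high": price, "low": price,
                          "close": price, "volume": qty, "number_of_trades": 1}
        else:
            bar = bars[hour]
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price  # the last trade seen so far
            bar["volume"] += qty
            bar["number_of_trades"] += 1
    return bars


bars = to_hourly_bars(trades)
for hour, bar in bars.items():
    print(hour.isoformat(), bar)
```

Each resulting record has the same fields as a row in the .csv files discussed in this article.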
Yahoo Finance for Stock Data
Below we explain how to download the historical data you need from Yahoo Finance and store it locally in a consistently formatted comma-separated (.csv) file for later use. We use one standard format for storing time series data for financial assets. This makes it possible to use the same (Python or R) software for technical analysis (TA) or for training machine learning models on both cryptocurrency price data and stock price data.
For downloading we no longer use the existing Python library yfinance, after experiencing deeply buried, curl-related access violations when worker threads finished. We now call the ‘naked’ API with nothing more than the plain requests library. You will also need the datetime module for calculating the correct timestamps and pandas for some basic data manipulation. For the details see below.
The GetStock class is little more than a thin wrapper that lets a PyQt GUI start the download and save the data as a standard .csv file. You don’t have to use it: the download function itself can be called from a standalone script.
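As a rough sketch of the timestamp calculation, the query parameters can be built with the standard library alone. The helper name build_chart_params and its now argument are assumptions added here so the result is reproducible; the parameter names themselves are the ones used in the download function in this article.

```python
from datetime import datetime, timedelta, timezone


def build_chart_params(days=730, now=None):
    """Build the query parameters for an hourly chart request (hypothetical helper).

    `now` can be passed in to get a reproducible result; by default the
    current UTC time is used.
    """
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(days=days)
    return {
        "period1": int(start.timestamp()),  # Unix timestamp of the first bar requested
        "period2": int(now.timestamp()),    # Unix timestamp of the last bar requested
        "interval": "1h",
        "includePrePost": "false",
        "events": "div,splits",
    }


fixed_now = datetime(2025, 1, 1, tzinfo=timezone.utc)
params = build_chart_params(days=730, now=fixed_now)
print(params["period1"], params["period2"])
```

Working in UTC from the start avoids surprises from the local time zone of the machine that runs the script.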
GetStock Class
# Copyright (c) 2024 Hans De Weme
# Licensed under the MIT License (https://opensource.org/licenses/MIT).
# Class GetStock
# Purpose: retrieve current hourly stock price data for the last 2 years from Yahoo Finance, provide a regular hourly datetime index and save as .csv file
# YAHOO FINANCE https://github.com/ranaroussi/yfinance/wiki/Ticker
from get_yahoo_data import _get_yahoo_hourly_data
from PyQt6.QtCore import QObject, pyqtSignal  # used to integrate in PyQt GUI applications

# class init arguments:
# asset - asset to download the hourly ticker data for
class GetStock(QObject):
    get_stock_successful = pyqtSignal(str)  # Signal to indicate asset dumped successfully in CWD

    def __init__(self, asset):
        super().__init__()  # necessary for QObject, needed for pyqtSignal
        self.asset = asset
        print('OK!, LETS GO FOR: ' + self.asset)

    def load_data(self):
        df = _get_yahoo_hourly_data(ticker=self.asset, colmn='all', days=730)
        if df is None:
            # the download failed and the error was already printed; nothing to save
            return
        print(df)
        df.info()  # info() prints its summary itself and returns None
        df.to_csv(self.asset + '-total.csv', index=True)
        self.get_stock_successful.emit(self.asset)

Generic Method: _get_yahoo_data.py
The real action happens inside the _get_yahoo_hourly_data function.

Imports
# Copyright (c) 2025 Hans De Weme
# Licensed under the MIT License (https://opensource.org/licenses/MIT).
# Generic Method: _get_yahoo_hourly_data
# Purpose: collect OHLC stock data from Yahoo Finance API
# Note: this can be any stock ticker like AAPL, but also an index such as the '^GSPC' S&P 500 index, or the '^VIX' CBOE Volatility Index
# Returns a DataFrame with a datetime index named 'dt' in the preferred time zone; if colmn == 'all', the complete OHLC record is returned
import requests
import pandas as pd
import time
from datetime import datetime, timedelta, timezone
from time_handle import handleTime
The Method
def _get_yahoo_hourly_data(ticker, colmn, days=730):
    # max tries
    max_tries = 3
    # fake a browser to avoid Yahoo limitations on direct API users
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
    }
    # create Unix timestamps to enable the max hourly dataset of 2 years (730 days) up to the current hour
    end_time = int(datetime.now(timezone.utc).timestamp())
    start_time = int((datetime.now(timezone.utc) - timedelta(days=days)).timestamp())
    # call url
    url = f"https://query1.finance.yahoo.com/v8/finance/chart/{ticker}"
    params = {
        "period1": start_time,
        "period2": end_time,
        "interval": "1h",
        "includePrePost": "false",
        "events": "div,splits"
    }
    try:
        # make requests; wait 3, 6, 9 seconds while Yahoo answers 429 (rate limited)
        for attempt in range(max_tries):
            # the timeout keeps a stalled connection from blocking forever
            response = requests.get(url, params=params, headers=headers, timeout=30)
            if response.status_code == 429:
                print(f"[WARN] Rate limited by Yahoo for {ticker}. Retrying...")
                time.sleep(3 * (attempt + 1))
            else:
                break
        # handle response
        response.raise_for_status()
        data = response.json()
        result = data['chart']['result'][0]
        timestamps = result['timestamp']
        quote = result['indicators']['quote'][0]
        df = pd.DataFrame({'dt': pd.to_datetime(timestamps, unit='s', utc=True)})
        if colmn == 'all':
            # copy every OHLCV series present in the response under its lowercase name
            for key in ['open', 'high', 'low', 'close', 'volume']:
                if key in quote:
                    df[key] = quote[key]
            df['number_of_trades'] = 0.0
        else:
            # single-column mode: store the closing price under the name passed in colmn
            df[colmn] = quote['close']
        df.dropna(inplace=True)
        # shift by 30 minutes so the timestamps fall on whole hours
        # (assumes the bars start on the half hour, e.g. 09:30 for US markets)
        df['dt'] = df['dt'] + pd.DateOffset(minutes=30)
        df = df.sort_values('dt')
        df.set_index('dt', inplace=True)
        # Convert to preferred time zone
        time_handle = handleTime("settings.json")
        df = time_handle.convert_dataframe_timezone(df, time_handle.tzone, original_tz='UTC')
        return df
    except Exception as e:
        print(f"[ERROR] Failed to fetch Yahoo Finance data for {ticker}: {e}")
        return None
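The function above reads only a handful of keys from the JSON response. The following sketch walks the same path through a hand-made payload, in plain Python so the structure stays visible. The payload is an assumption: it mirrors only the keys the function accesses and its numbers are invented, while a real response carries more fields. The None entries stand in for a bar without data, which is presumably why the function drops incomplete rows.

```python
from datetime import datetime, timezone

# Hand-made payload with invented values, shaped like the keys the function reads.
sample_payload = {
    "chart": {
        "result": [{
            "timestamp": [1735828200, 1735831800, 1735835400],
            "indicators": {"quote": [{
                "open": [10.0, 10.4, None],
                "high": [10.6, 10.9, None],
                "low": [9.9, 10.3, None],
                "close": [10.4, 10.8, None],
                "volume": [1200, 900, None],
            }]},
        }]
    }
}


def rows_from_payload(payload):
    """Turn the nested payload into a list of complete OHLCV rows (hypothetical helper)."""
    result = payload["chart"]["result"][0]
    quote = result["indicators"]["quote"][0]
    keys = ["open", "high", "low", "close", "volume"]
    rows = []
    for i, ts in enumerate(result["timestamp"]):
        values = [quote[key][i] for key in keys]
        if any(value is None for value in values):
            continue  # skip bars with missing values
        row = {"dt": datetime.fromtimestamp(ts, tz=timezone.utc)}
        row.update(zip(keys, values))
        rows.append(row)
    return rows


rows = rows_from_payload(sample_payload)
for row in rows:
    print(row)
```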
CSV File Details
Format:
- Index: dt, a regular hourly datetime index in the preferred time zone.
- Columns:
  - open: Opening price.
  - high: Highest price.
  - low: Lowest price.
  - close: Closing price.
  - volume: Trading volume.
  - number_of_trades: Placeholder, always set to 0.0, because the code does not read a trade count from Yahoo.
File Naming:
- <ticker>-total.csv
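Reading such a file back for analysis could look like the sketch below. The two rows are invented and loaded from a string so the example runs on its own; with a real download you would pass the file name (for example AAPL-total.csv) to read_csv instead. Parsing with utc=True is a choice made for this example: it gives one uniform index even when the stored rows carry different UTC offsets around daylight saving changes.

```python
import io

import pandas as pd

# Invented sample in the layout described above; not real prices.
csv_text = """dt,open,high,low,close,volume,number_of_trades
2025-01-02 10:00:00-05:00,10.0,10.6,9.9,10.4,1200,0.0
2025-01-02 11:00:00-05:00,10.4,10.9,10.3,10.8,900,0.0
"""

# With a real file: pd.read_csv("AAPL-total.csv", index_col="dt")
df = pd.read_csv(io.StringIO(csv_text), index_col="dt")

# Parse the index into timezone-aware timestamps, normalised to UTC
df.index = pd.to_datetime(df.index, utc=True)

print(df)
```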
Technical Notes
Timezone Handling:
- Converts the UTC timestamps returned by Yahoo to the preferred time zone, which is read from settings.json.
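The handleTime helper is not shown in this article. As a sketch of what such a conversion might boil down to, pandas can convert a UTC index directly with tz_convert. The zone name America/New_York and the sample values are assumptions chosen for the example; the article's code takes the zone from its settings file.

```python
import pandas as pd

# Two invented hourly bars with UTC timestamps
index = pd.to_datetime([1735828200, 1735831800], unit="s", utc=True)
df = pd.DataFrame({"close": [10.4, 10.8]}, index=index)
df.index.name = "dt"

# Convert the index from UTC to the preferred zone (an assumed example zone)
local = df.tz_convert("America/New_York")

print(local)
```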
Dependencies:
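- Relies on the Yahoo Finance API, called directly through the requests library.
- Retries up to three times with an increasing delay when Yahoo answers with HTTP 429 (rate limited); HTTP errors and malformed responses are caught, reported, and result in None being returned.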
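The retry behaviour can be isolated in a few lines. The sketch below is not the article's code: fetch_with_retry and FakeResponse are hypothetical names, and the request and the sleep function are passed in so the logic can be tried without touching the network or actually waiting.

```python
import time


def fetch_with_retry(fetch, max_tries=3, base_delay=3, sleep=time.sleep):
    """Call fetch() until it returns a status other than 429 or the tries run out.

    `fetch` is any zero-argument callable returning an object with a
    `status_code` attribute (hypothetical helper for illustration).
    """
    response = None
    for attempt in range(max_tries):
        response = fetch()
        if response.status_code != 429:
            break
        # wait longer after every rate-limited attempt: 3, 6, 9 seconds by default
        sleep(base_delay * (attempt + 1))
    return response


class FakeResponse:
    """Stand-in for a requests response, carrying only a status code."""

    def __init__(self, status_code):
        self.status_code = status_code


# Simulate two rate-limited answers followed by a success
statuses = iter([429, 429, 200])
delays = []  # records the delays instead of really sleeping
result = fetch_with_retry(lambda: FakeResponse(next(statuses)), sleep=delays.append)
print(result.status_code, delays)
```

In real use, fetch would wrap the requests.get call and sleep would be left at its default.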
License
This script is licensed under the MIT License.