
Exploring BankNifty Movements: A Data Analysis

Welcome to our exploration of BankNifty movements using data analysis techniques! In this blog post, we’ll dive into the fascinating world of financial data and uncover insights into the movements of BankNifty, a key index in the Indian stock market. Through the power of Python programming and data visualization libraries, we’ll dissect BankNifty data, understand its patterns, and gain valuable insights that can inform trading decisions.

For a detailed BankNifty analysis, read the blog post UNLOCKING INSIGHTS: A COMPREHENSIVE BANKNIFTY ANALYSIS.

BankNifty is a vital indicator of the performance of banking stocks listed on the National Stock Exchange of India (NSE). Understanding its movements is crucial for investors, traders, and financial analysts seeking to navigate the dynamic landscape of the stock market.

				
# Mount Google Drive to access the dataset
from google.colab import drive
drive.mount('/content/drive')

				
			
  • This code snippet imports the drive module from the google.colab library in Google Colab.
  • Google Colab provides a feature to mount Google Drive within the Colab environment, allowing access to files stored in Google Drive directly from the Colab notebook.
  • The drive.mount('/content/drive') command mounts the root directory of the user’s Google Drive at the specified path (/content/drive) within the Colab environment.
  • After executing this command, users can access files stored in their Google Drive by navigating to the specified path within the Colab notebook.
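Once the drive is mounted, files in Google Drive can be referenced by their path under /content/drive. The snippet below is a minimal sketch; the folder name is a hypothetical placeholder, so adjust it to your own Drive layout.

import os

# Hypothetical folder inside Google Drive that holds the BankNifty CSV (adjust to your own structure)
data_dir = '/content/drive/MyDrive/banknifty'

# List the files in that folder to confirm the dataset is visible from the Colab notebook
print(os.listdir(data_dir))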
				
import pandas as pd

				
			
  • pandas is a powerful data manipulation library in Python.
  • It provides data structures and functions to efficiently work with structured data, such as data frames.
  • With pandas, you can load, manipulate, filter, aggregate, and analyze data easily.
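As a quick, self-contained illustration of the kind of workflow pandas enables (the numbers below are made up purely for demonstration):

import pandas as pd

# A small, made-up DataFrame with open and close prices
prices = pd.DataFrame({"Open": [100, 102, 101], "Close": [102, 101, 103]})

# Filter the rows where the close is above the open, then compute the average gain
up_moves = prices[prices["Close"] > prices["Open"]]
print((up_moves["Close"] - up_moves["Open"]).mean())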
				
import matplotlib.pyplot as plt


				
			
  • matplotlib.pyplot is a plotting library for Python.
  • It provides a MATLAB-like interface for creating a variety of plots and visualizations, including line plots, bar charts, histograms, scatter plots, and more.
  • Matplotlib is highly customizable, allowing you to control aspects like colors, labels, axes, and annotations.
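A minimal example of the kind of chart matplotlib produces; the closing prices here are made up, since the BankNifty data is only loaded later in the post.

import matplotlib.pyplot as plt

# Made-up closing prices for illustration only
closes = [48010, 48120, 48080, 48240, 48190]

plt.plot(closes, marker='o')        # Line plot with point markers
plt.title("Sample closing prices")  # Chart title
plt.xlabel("Interval")              # X-axis label
plt.ylabel("Close")                 # Y-axis label
plt.show()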
				
import plotly.express as px


				
			
  • plotly.express is part of the Plotly library, which is used for interactive data visualization.
  • Plotly allows you to create interactive plots with features like zooming, panning, hovering, and tooltips.
  • Plotly Express is a high-level interface for creating a variety of plot types quickly and easily, without the need for extensive coding.
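For comparison, this is roughly what a one-line interactive chart looks like in Plotly Express (again with made-up data, just to show the interface):

import plotly.express as px
import pandas as pd

# Made-up data for demonstration
sample = pd.DataFrame({"interval": [1, 2, 3, 4], "close": [48010, 48120, 48080, 48240]})

# An interactive line chart with hover tooltips, zooming and panning out of the box
fig = px.line(sample, x="interval", y="close", title="Sample interactive chart")
fig.show()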
				
# Read the BankNifty data from a CSV file into a DataFrame
df1 = pd.read_csv('example.csv', index_col=0)

# Reset the index of the DataFrame for consistency
df1.reset_index(drop=True, inplace=True)

# Display the first row of the DataFrame for preview
df1.head(1)

# Convert the 'datetime' column to datetime format
# This allows us to work with dates and times more easily
df1["datetime"] = pd.to_datetime(df1["datetime"])



				
			
  1. Read the CSV file:
    • Load the BankNifty data from a CSV file into a DataFrame named df1.
    • The index_col=0 parameter specifies that the first column should be used as the index of the DataFrame.
  2. Reset the index:
    • Resetting the index of df1 ensures that the index is sequential starting from 0.
    • This is useful for maintaining consistency and avoiding potential issues with data manipulation.
  3. Display the first row:
    • Displaying the first row of the DataFrame allows us to preview the data and check if the reset index is applied correctly.
  4. Convert ‘datetime’ column:
    • Convert the ‘datetime’ column to datetime format using the pd.to_datetime() function.
    • This conversion makes it easier to perform time-based operations and analysis on the data.
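To see what this conversion does in isolation, here is a small standalone sketch with made-up timestamps (not the actual dataset):

import pandas as pd

# A made-up sample resembling the 'datetime' column before conversion
sample = pd.DataFrame({"datetime": ["2024-01-02 09:15:00", "2024-01-02 09:16:00"]})

sample["datetime"] = pd.to_datetime(sample["datetime"])

# After conversion, time-based attributes become available through the .dt accessor
print(sample["datetime"].dt.time)
print(sample["datetime"].dtype)   # datetime64[ns]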
				
# Set the 'datetime' column as the index of the DataFrame
# This allows us to easily work with time-series data
df1.set_index("datetime", inplace=True)

# Resample the data at 30-minute intervals and aggregate the prices
# This groups the data into 30-minute intervals and calculates summary statistics
df2 = df1.resample('30T').agg({
    "Open": "first",   # Take the first value of 'Open' within each 30-minute interval
    "High": "max",     # Take the maximum value of 'High' within each 30-minute interval
    "Low": "min",      # Take the minimum value of 'Low' within each 30-minute interval
    "Close": "last"   # Take the last value of 'Close' within each 30-minute interval
})

				
			
  1. Set index to ‘datetime’:
    • The set_index() function is used to set the ‘datetime’ column as the index of the DataFrame df1.
    • This allows us to easily perform time-based operations and analysis on the data.
  2. Resample data at 30-minute intervals:
    • The resample() function is used to group the data into 30-minute intervals (’30T’ stands for 30 minutes).
    • Within each 30-minute interval, summary statistics are calculated for the ‘Open’, ‘High’, ‘Low’, and ‘Close’ prices.
    • The agg() function is used to specify the aggregation method for each column:
      • ‘first’ is used to take the first value of ‘Open’ within each interval.
      • ‘max’ is used to find the maximum value of ‘High’ within each interval.
      • ‘min’ is used to find the minimum value of ‘Low’ within each interval.
      • ‘last’ is used to take the last value of ‘Close’ within each interval.
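To make the aggregation concrete, here is a small sketch of the same 30-minute OHLC grouping on synthetic one-minute bars; it is illustrative only and not part of the original analysis.

import pandas as pd

# Synthetic 1-minute OHLC bars covering one hour, for illustration only
idx = pd.date_range("2024-01-02 09:15", periods=60, freq="min")
bars = pd.DataFrame({"Open": range(60), "High": range(1, 61),
                     "Low": range(60), "Close": range(1, 61)}, index=idx)

# Group into 30-minute candles using the same aggregation rules as above
half_hourly = bars.resample("30min").agg(
    {"Open": "first", "High": "max", "Low": "min", "Close": "last"})
print(half_hourly)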
				
# Drop rows with any missing values (NaNs) from the DataFrame
# This ensures that the data is clean and complete for analysis
df2.dropna(inplace=True)

# Calculate the price movement for each 30-minute interval
# The 'move' column represents the difference between the closing and opening prices
df2["move"] = df2.Close - df2.Open

# Convert the price movement to percentage change relative to the closing price
# The 'move_per' column represents the percentage change in price for each interval
df2["move_per"] = (df2["move"] / df2["Close"]) * 100

# Round the percentage change to two decimal places for readability
# This makes the 'move_per' column easier to interpret
df2["move_per"] = df2["move_per"].round(2)


				
			
  1. Drop missing values:
    • The dropna() function is used to remove any rows with missing values (NaNs) from the DataFrame df2.
    • This ensures that the data is complete and ready for analysis, as missing values can affect the accuracy of calculations.
  2. Calculate price movement:
    • The ‘move’ column is created to represent the difference between the closing (‘Close’) and opening (‘Open’) prices for each 30-minute interval.
    • This calculates how much the price has moved within each interval, indicating whether it has increased or decreased.
  3. Convert to percentage change:
    • The ‘move_per’ column is created to represent the percentage change in price for each interval relative to the closing price.
    • This calculates the percentage change in price movement, providing insights into the magnitude of price fluctuations.
  4. Round percentage change:
    • The round() function is used to round the percentage change to two decimal places.
    • This improves the readability of the data by reducing the number of decimal places, making it easier to interpret and compare.
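As a worked example with made-up prices: if a 30-minute candle opens at 48,000 and closes at 48,240, the move is 240 points and move_per is 240 / 48,240 * 100 ≈ 0.50%.

# Worked example with made-up prices
open_price, close_price = 48000, 48240

move = close_price - open_price                  # 240 points
move_per = round(move / close_price * 100, 2)    # 0.5 (percent, relative to the close, as above)
print(move, move_per)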
				
# Initialize a list 'Range' with zeros, with the same length as the DataFrame 'df2'
Range = [0] * len(df2)

# Loop through each row in the DataFrame
for row in range(len(df2)):
    # Check if the price movement percentage is positive and within the range 0.00% to 0.50%
    if (df2.move_per.iloc[row] > 0) and (df2.move_per.iloc[row] <= 0.50):
         Range[row] = 1
    # Check if the price movement percentage is positive and within the range 0.50% to 1.00%
    elif (df2.move_per.iloc[row] > 0.50) and (df2.move_per.iloc[row] <= 1):
         Range[row] = 2
    # Check if the price movement percentage is positive and within the range 1.00% to 1.50%
    elif (df2.move_per.iloc[row] > 1) and (df2.move_per.iloc[row] <= 1.50):
         Range[row] = 3
    # Check if the price movement percentage is positive and within the range 1.50% to 2.00%
    elif (df2.move_per.iloc[row] > 1.50) and (df2.move_per.iloc[row] <= 2):
         Range[row] = 4
    # Check if the price movement percentage is positive and greater than 2.00%
    elif (df2.move_per.iloc[row] > 2):
         Range[row] = 5
    # Check if the price movement percentage is negative and within the range -0.50% to 0.00%
    elif (df2.move_per.iloc[row] < 0) and (df2.move_per.iloc[row] >= -0.50):
         Range[row] = 6
    # Check if the price movement percentage is negative and within the range -1.00% to -0.50%
    elif (df2.move_per.iloc[row] < -0.50) and (df2.move_per.iloc[row] >= -1):
         Range[row] = 7
    # Check if the price movement percentage is negative and within the range -1.50% to -1.00%
    elif (df2.move_per.iloc[row] < -1) and (df2.move_per.iloc[row] >= -1.50):
         Range[row] = 8
    # Check if the price movement percentage is negative and within the range -2.00% to -1.50%
    elif (df2.move_per.iloc[row] < -1.50) and (df2.move_per.iloc[row] >= -2):
         Range[row] = 9
    # If none of the conditions above are met (a drop of more than 2.00% or no movement), assign 10
    else:
         Range[row] = 10
# Assign the 'Range' list as a new column 'range' in the DataFrame 'df2'
df2["range"] = Range
				
			
  • The code assigns a numerical range to each row in the DataFrame based on the percentage change in price movement.
  • Positive percentage changes are categorized into ranges 1 to 5, and negative percentage changes are categorized into ranges 6 to 10.
  • Moves that fall outside the listed ranges (drops of more than 2.00%, or intervals with no movement at all) are assigned to range 10.
  • The assigned ranges are stored in a new column named ‘range’ in the DataFrame ‘df2’.
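For readers who prefer a vectorized approach, the same codes can be produced without an explicit loop using numpy's np.select. This is an alternative sketch, not the original code, and it assumes df2["move_per"] has been computed as above.

import numpy as np

# Conditions mirror the if/elif chain above, in the same order
conditions = [
    (df2["move_per"] > 0)     & (df2["move_per"] <= 0.50),
    (df2["move_per"] > 0.50)  & (df2["move_per"] <= 1),
    (df2["move_per"] > 1)     & (df2["move_per"] <= 1.50),
    (df2["move_per"] > 1.50)  & (df2["move_per"] <= 2),
    (df2["move_per"] > 2),
    (df2["move_per"] < 0)     & (df2["move_per"] >= -0.50),
    (df2["move_per"] < -0.50) & (df2["move_per"] >= -1),
    (df2["move_per"] < -1)    & (df2["move_per"] >= -1.50),
    (df2["move_per"] < -1.50) & (df2["move_per"] >= -2),
]
codes = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Anything not matched above (a drop of more than 2% or no movement) falls into code 10
df2["range"] = np.select(conditions, codes, default=10)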
				
def visualize_data(hours, minute, df, title):
    # Convert input time into datetime format
    Time = pd.to_datetime(f"{hours:02d}:{minute:02d}:00", format="%H:%M:%S").time()
    # Filter rows whose index time matches the specified time
    # (the 'datetime' column was set as the DataFrame index earlier)
    # Copy the filtered rows into a new DataFrame 'df3' so later column assignments are safe
    df3 = df[df.index.time == Time].copy()
    # Map numerical ranges to descriptive labels for better visualization
    df3["range_label"] = df3["range"].map({
        1: "0.00% to 0.50%",
        2: "0.50% to 1.00%",
        3: "1.00% to 1.50%",
        4: "1.50% to 2.00%",
        5: "> 2.00%",
        6: "-0.50% to 0.00%",
        7: "-1.00% to -0.50%",
        8: "-1.50% to -1.00%",
        9: "-2.00% to -1.50%",
        10: "< -2.00%"
    })
    # Create a pie chart using Plotly Express to visualize the distribution of price ranges
    # With no 'values' argument, each slice is sized by how many intervals fall into that range
    fig1 = px.pie(df3, names="range_label", title=title)
    fig1.update_layout(margin=dict(l=0, r=0), title={"x": 0.5})  # Update layout for better visualization
    fig1.show()  # Display the pie chart
    # Print the counts of price movements above 1%, 0.50%, below -1%, and -0.50%
    print('Value above 1%:', (df3['move_per'] > 1).sum())
    print('Value above 0.50%:', (df3['move_per'] > 0.50).sum())
    print('Value below -1%:', (df3['move_per'] < -1).sum())
    print('Value below -0.50%:', (df3['move_per'] < -0.50).sum())

				
			
  • The function visualize_data() takes four parameters: hours, minute, df, and title.
  • It filters the DataFrame df based on the specified time (hours and minute), creates a new DataFrame df3 for the filtered data, and assigns descriptive labels to numerical ranges.
  • It creates a pie chart using Plotly Express to visualize the distribution of price ranges and displays it.
  • It prints the counts of price movements above 1%, above 0.50%, below -1%, and below -0.50%.
				
					
visualize_data(15, 0, df2, "banknifty 15:00 to 15:30")
				
			
  • This call filters df2 for the 15:00 candle and shows how the 15:00 to 15:30 price moves are distributed across the defined ranges.
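The same function can be reused for any other 30-minute candle; for example, an illustrative call for the 09:30 interval (assuming that candle exists in your data):

visualize_data(9, 30, df2, "banknifty 09:30 to 10:00")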
