How to Create a Histogram in Python

In this tutorial, we will learn about Python Histograms. A histogram provides a straightforward method for visually displaying data when you’ve organized it into groups. It effectively illustrates the distribution of numerical data. In essence, it functions as a type of bar plot where the X-axis indicates the ranges, or “bins,” and the Y-axis reveals the frequency of each bin.

In Python, you can create histograms with several libraries, but Matplotlib is one of the most commonly used for this purpose. It offers a convenient way to create histograms and tailor their appearance.
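To make the idea of bins and frequencies concrete, here is a minimal sketch that counts values per bin with NumPy's np.histogram, without drawing anything. The small sample and the choice of four bins over the range 1 to 5 are assumptions made for illustration only.

```python
import numpy as np

# Small hand-made sample (illustrative values only)
data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]

# Four equal-width bins covering 1 to 5: [1,2), [2,3), [3,4), [4,5]
counts, edges = np.histogram(data, bins=4, range=(1, 5))

print(counts)  # [1 2 3 4]          -> frequency of each bin (Y-axis)
print(edges)   # [1. 2. 3. 4. 5.]   -> bin boundaries (X-axis)
```

Each count is the height of one bar, and each pair of neighboring edges marks where that bar starts and ends.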

Creating a Python Histogram Using NumPy and Matplotlib

To create a histogram in Python, you typically call Matplotlib’s plt.hist() and customize it through its parameters, together with a few related pyplot functions for labels, titles, and grid lines. The table below lists the most common options, such as the number of bins, colors, and labels.

Common attributes:

Attribute | Description | Example
data | The data you want to create a histogram for. | data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
bins | Number of bins (bars) in the histogram. | bins = 5
bins (as a sequence) | Custom bin edges, passed through the same bins parameter. | bins = [0, 2, 4, 6, 8, 10]
color | Color of the bars in the histogram. | color = 'blue'
alpha | Transparency of the bars (0.0 to 1.0). | alpha = 0.7
edgecolor | Color of the bar edges. | edgecolor = 'black'
density | Normalize the histogram so that the total area of the bars equals 1. | density = True
cumulative | Create a cumulative histogram. | cumulative = True
histtype | Type of histogram ('bar', 'barstacked', 'step', 'stepfilled'). | histtype = 'step'
range | Lower and upper limits of the values to consider. | range = (0, 10)
log | Use a logarithmic scale for the count axis. | log = True
weights | Weight for each data point; must have the same shape as data. | weights = [0.5] * len(data)
xlabel | Label for the X-axis (separate pyplot function). | plt.xlabel('X-axis Label')
ylabel | Label for the Y-axis (separate pyplot function). | plt.ylabel('Frequency')
title | Title of the histogram (separate pyplot function). | plt.title('Histogram Example')
grid | Display grid lines on the plot (separate pyplot function). | plt.grid(True)
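A few of these options are easier to understand when seen together. The sketch below is only an illustration: the bin edges and the constant weights are hypothetical values, and the Agg backend is selected so the script runs without a display. In Matplotlib, custom bin edges are passed through the bins argument, and weights needs one entry per data point.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line and call plt.show() to open a window
import matplotlib.pyplot as plt

data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
edges = [0, 2, 4, 6]          # custom bin edges (hypothetical), passed via bins
weights = [0.5] * len(data)   # hypothetical weights, one per data point

fig, (ax_w, ax_d) = plt.subplots(1, 2, figsize=(8, 3))

# Each value contributes its weight (0.5) instead of 1 to its bin
weighted, _, _ = ax_w.hist(data, bins=edges, weights=weights, edgecolor="black")
ax_w.set_title("Weighted counts")

# density=True rescales the bars so that height * bin width sums to 1
density, _, _ = ax_d.hist(data, bins=edges, density=True, edgecolor="black")
ax_d.set_title("Density")

print(weighted)
print(density)
```

With these edges, the first bin holds only the value 1, the second holds the 2s and 3s, and the last holds the 4s, so the weighted heights are half the plain counts.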

Example :

import matplotlib.pyplot as plt
import numpy as np

# Generate some random data (you can replace this with your own data)
data = np.random.randn(1000)  # Random data with a normal distribution

# Create a histogram
plt.hist(data, bins=20, color='blue', alpha=0.7)
plt.xlabel('X-axis Label')
plt.ylabel('Frequency')
plt.title('Histogram Example')
plt.grid(True)

# Display the histogram
plt.show()

Output: [Figure: histogram of 1,000 normally distributed values drawn with 20 blue bars]

  • We import the necessary libraries: Matplotlib’s pyplot for creating plots and NumPy for generating random data.
  • We generate 1,000 random values with NumPy’s numpy.random.randn, which samples from a standard normal distribution. You can replace this with your own dataset or with data loaded from a file.
  • We construct the histogram using plt.hist(). The bins parameter controls the number of bars, so you can adjust the granularity, while color and alpha set the bar color and transparency.
  • We add labels for the X and Y axes and assign a title to the plot.
  • We enable grid lines with plt.grid(True) to make the bar heights easier to read.
  • Finally, we display the histogram by calling plt.show().

How to Customize a Python Histogram in Matplotlib

Matplotlib offers several ways to customize histograms. The matplotlib.pyplot.hist() function accepts many parameters that let us tailor a histogram to our needs. It also returns three values: the bin counts, the bin edges, and a ‘patches’ container that holds the drawn bar objects. Through the patches, we can adjust the properties of individual bars after the plot has been created.

import matplotlib.pyplot as plt
import numpy as np

# Generate random data with a normal distribution
data = np.random.randn(1000)

# Create a histogram with custom attributes
plt.hist(data, bins=30, color='green', alpha=0.6, edgecolor='black', density=True, cumulative=True)
plt.xlabel('Value')
plt.ylabel('Cumulative Probability')  # density=True with cumulative=True gives a running total that ends at 1
plt.title('Customized Histogram Example')
plt.grid(True)

# Display the histogram
plt.show()

Output: [Figure: normalized cumulative histogram with 30 green, black-edged bars rising toward 1]
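The ‘patches’ object mentioned above is not used in that example, so here is a minimal sketch of one way it might be used: recoloring the tallest bar after plotting. The seed, the colors, and the choice to highlight the tallest bar are assumptions made for illustration, and the Agg backend is selected so the script runs without a display.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line and call plt.show() to open a window
import matplotlib.pyplot as plt

# Seeded generator so the sketch is reproducible (the seed value is arbitrary)
rng = np.random.default_rng(0)
data = rng.standard_normal(1000)

fig, ax = plt.subplots()

# hist() returns the bin counts, the bin edges, and the container of bar patches
counts, bin_edges, patches = ax.hist(data, bins=30, color="green", edgecolor="black")

# Find the bin with the most values and recolor only that bar
tallest = int(np.argmax(counts))
patches[tallest].set_facecolor("orange")

ax.set_xlabel("Value")
ax.set_ylabel("Frequency")
ax.set_title("Tallest Bin Highlighted")
```

Because each patch is an ordinary rectangle object, the same approach works for other per-bar changes, such as setting a different edge color or transparency.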

How to Create Multiple Histograms in Python

Plotting multiple histograms on the same axes with Matplotlib is useful when you want to compare the distributions of several datasets in a single graph.

import matplotlib.pyplot as plt
import numpy as np

# Generate two sets of random data
data1 = np.random.normal(0, 1, 1000)  # Data with mean 0 and standard deviation 1
data2 = np.random.normal(2, 1, 1000)  # Data with mean 2 and standard deviation 1

# Create histograms for both datasets
plt.hist(data1, bins=20, color='blue', alpha=0.5, label='Data 1')
plt.hist(data2, bins=20, color='red', alpha=0.5, label='Data 2')

# Customize the plot
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Multiple Histograms Example')
plt.legend()  # Display legend to differentiate between data1 and data2
plt.grid(True)

# Display the histogram
plt.show()

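Output: [Figure: two overlapping semi-transparent histograms, Data 1 in blue and Data 2 in red, with a legend]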

  • First, we import the necessary libraries, Matplotlib and NumPy.
  • Next, we generate two sets of random data using NumPy’s np.random.normal function. ‘data1’ is sampled from a normal distribution with a mean of 0 and a standard deviation of 1, and ‘data2’ from a normal distribution with a mean of 2 and a standard deviation of 1.
  • We then create a histogram for each dataset with plt.hist, specifying the number of bins, the color, and alpha (which controls transparency, so the overlapping region stays visible). The label parameter assigns the name shown in the legend for each dataset.
  • We add labels for the X and Y axes, set a plot title, and enable grid lines.
  • We call plt.legend() to display a legend that distinguishes the two datasets.
  • Finally, we display the plot with plt.show().
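In the example above, each call to plt.hist computes its own 20 bins from its own data, so the two sets of bars may not line up. One possible refinement, sketched here under the assumption that aligned bars are wanted, is to compute a single set of bin edges with np.histogram_bin_edges and pass it to both calls. The seed is arbitrary, and the Agg backend is selected so the script runs without a display.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line and call plt.show() to open a window
import matplotlib.pyplot as plt

# Seeded generator so the sketch is reproducible (the seed value is arbitrary)
rng = np.random.default_rng(42)
data1 = rng.normal(0, 1, 1000)  # mean 0, standard deviation 1
data2 = rng.normal(2, 1, 1000)  # mean 2, standard deviation 1

# One set of 20 bins spanning both datasets, so the bars share boundaries
shared_edges = np.histogram_bin_edges(np.concatenate([data1, data2]), bins=20)

fig, ax = plt.subplots()
counts1, _, _ = ax.hist(data1, bins=shared_edges, color="blue", alpha=0.5, label="Data 1")
counts2, _, _ = ax.hist(data2, bins=shared_edges, color="red", alpha=0.5, label="Data 2")

ax.set_xlabel("Value")
ax.set_ylabel("Frequency")
ax.set_title("Multiple Histograms with Shared Bins")
ax.legend()
ax.grid(True)
```

With shared edges, bars at the same position cover the same value range, which makes the heights of the two datasets directly comparable.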
