This repository introduces key Python libraries used in data science, including their installation and basic usage. It covers NumPy, pandas, Matplotlib, scikit-learn, TensorFlow, and tools for web scraping. Each section includes installation commands, basic commands, and a simple example to help you get started.
You can install the required libraries using pip. Run the following command:
pip install numpy pandas matplotlib scikit-learn tensorflow beautifulsoup4 requests
This will install:
- NumPy for numerical computing
- pandas for data manipulation
- Matplotlib for data visualization
- scikit-learn for machine learning
- TensorFlow for deep learning
- BeautifulSoup and requests for web scraping
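To check that the packages listed above were installed, a small script along these lines may help. It is a sketch that uses only the standard library's `importlib.metadata`; the helper name `installed_versions` is hypothetical and not part of this repository.

```python
from importlib import metadata

# Distribution names exactly as they are passed to pip.
PACKAGES = [
    "numpy", "pandas", "matplotlib", "scikit-learn",
    "tensorflow", "beautifulsoup4", "requests",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```

Any package reported as "not installed" can be installed on its own with the per-library commands in the sections that follow.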
NumPy is the fundamental package for numerical computing in Python. It provides support for arrays, matrices, and many mathematical functions.
pip install numpy
import numpy as np
# Create an array
arr = np.array([1, 2, 3])
print(arr)
# Perform operations
print(np.mean(arr))
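As a further illustration of the array, matrix, and math support mentioned above, the following sketch shows element-wise arithmetic and a matrix-vector product. The variable names and the diagonal matrix are illustrative assumptions, not part of the original example.

```python
import numpy as np

arr = np.array([1, 2, 3])

# Arithmetic applies element-wise, with no explicit Python loop.
doubled = arr * 2
squared = arr ** 2

# A 2-D array (matrix) and a matrix-vector product using the @ operator.
matrix = np.array([[1, 0, 0],
                   [0, 2, 0],
                   [0, 0, 3]])
product = matrix @ arr

print(doubled)
print(squared)
print(product)
print(arr.sum(), arr.max())
```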
pandas is an open-source library that provides high-performance data manipulation and analysis tools, particularly DataFrames.
pip install pandas
import pandas as pd
# Create a DataFrame
data = {'Name': ['Alice', 'Bob'], 'Age': [25, 30]}
df = pd.DataFrame(data)
# Display the DataFrame
print(df)
# Perform basic operations
print(df.describe())
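Beyond `describe()`, common first steps with a DataFrame are selecting a column, filtering rows, and adding a derived column. The sketch below reuses the same sample data; the column name `AgeInMonths` and the threshold of 26 are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# Select a single column (returns a Series).
ages = df['Age']

# Filter rows with a boolean mask.
over_26 = df[df['Age'] > 26]

# Add a derived column computed from an existing one.
df['AgeInMonths'] = df['Age'] * 12

print(ages.mean())
print(over_26)
print(df)
```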
Matplotlib is a plotting library used for creating static, interactive, and animated visualizations in Python.
pip install matplotlib
import matplotlib.pyplot as plt
# Create a simple plot
plt.plot([1, 2, 3], [4, 5, 6])
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Simple Plot')
plt.show()
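When no display is available (for example on a server), the same plot can be written to a file instead of shown. The sketch below assumes the non-interactive `Agg` backend and uses Matplotlib's object-oriented interface; the output file name `simple_plot.png` is a hypothetical choice.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, assumed here for headless use
import matplotlib.pyplot as plt

# Create a figure and a single set of axes.
fig, ax = plt.subplots()
ax.plot([1, 2, 3], [4, 5, 6], marker='o', label='sample data')
ax.set_xlabel('X-axis')
ax.set_ylabel('Y-axis')
ax.set_title('Simple Plot')
ax.legend()

# Write the figure to disk; the file name is a hypothetical choice.
fig.savefig('simple_plot.png', dpi=150)
plt.close(fig)
```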
BeautifulSoup and requests are essential libraries for web scraping, allowing you to extract data from websites.
pip install beautifulsoup4 requests
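import requests
from bs4 import BeautifulSoup
# Fetch content from a webpage
url = 'https://example.com'
response = requests.get(url, timeout=10)
response.raise_for_status()  # stop early on HTTP errors (4xx/5xx)
# Parse HTML content
soup = BeautifulSoup(response.text, 'html.parser')
# soup.title is None when the page has no <title> tag
print(soup.title.text if soup.title else 'No title found')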
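Extracting data usually means locating specific tags once the HTML is parsed. The following sketch works on a small hand-written HTML string rather than a live page, so it runs without network access; the markup and link URLs are made up for illustration.

```python
from bs4 import BeautifulSoup

# A small hand-written HTML document, used here instead of a live page.
html = """
<html>
  <head><title>Sample Page</title></head>
  <body>
    <a href="https://example.com/a">First</a>
    <a href="https://example.com/b">Second</a>
  </body>
</html>
"""

soup = BeautifulSoup(html, 'html.parser')

title = soup.title.get_text()

# Collect (text, href) pairs for every anchor tag in document order.
links = [(a.get_text(), a.get('href')) for a in soup.find_all('a')]

print(title)
for text, href in links:
    print(text, href)
```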
Scikit-learn is a library for machine learning, offering simple and efficient tools for data analysis and modeling.
pip install scikit-learn
from sklearn.linear_model import LinearRegression
import numpy as np
# Sample data
X = np.array([[1], [2], [3]])
y = np.array([2, 4, 6])
# Create a model and fit it
model = LinearRegression()
model.fit(X, y)
# Make a prediction
print(model.predict([[4]]))
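In practice a model is usually evaluated on data it was not trained on. The sketch below extends the same idea with `train_test_split` and `score`, using synthetic data that follows y = 2x exactly; the data, the 30% test split, and `random_state=0` are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data following y = 2x (illustrative only).
X = np.arange(1, 11).reshape(-1, 1)
y = 2 * X.ravel()

# Hold out 30% of the samples for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

model = LinearRegression()
model.fit(X_train, y_train)

# Learned slope and intercept, then R^2 on the held-out samples.
print(model.coef_, model.intercept_)
print(model.score(X_test, y_test))
```

Because the data is perfectly linear, the fitted slope should be very close to 2 and the R^2 score very close to 1; real data will rarely be this clean.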
TensorFlow is an open-source platform for machine learning and deep learning, commonly used for building neural networks.
pip install tensorflow
import tensorflow as tf
# Create a constant tensor
hello = tf.constant('Hello, TensorFlow!')
# .numpy() returns bytes for a string tensor, so decode it for display
print(hello.numpy().decode('utf-8'))
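Tensors also support numerical operations, which are the building blocks of neural networks. This minimal sketch assumes TensorFlow 2.x with eager execution (the default) and adds and multiplies two small matrices; the values are arbitrary.

```python
import tensorflow as tf

# Two 2x2 float tensors; b is the identity matrix.
a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[1.0, 0.0], [0.0, 1.0]])

# Element-wise addition and matrix multiplication.
total = tf.add(a, b)
product = tf.matmul(a, b)

print(total.numpy())
print(product.numpy())
```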
If you'd like to contribute, feel free to fork the repository and submit a pull request. For major changes, please open an issue to discuss what you would like to change.
This project is licensed under the MIT License.