2.2 hacks

AP Prep

In the blog, add notes and observations on each code cell that requests an answer.

Code Block 2

  1. image_data(): prepares a series of images and returns a list of image dictionaries.
  2. scale_image(): scales a PIL image to a base width of 320 and returns the resized image.
  3. image_to_base64(): converts a PIL image to a base64 representation.
  4. image_management(): sets properties of an image, scales it, and converts it to base64.
  5. image_management_add_html_grey(): creates a grayscale representation of an image and adds it as HTML.
  6. if __name__ == "__main__": checks whether the script is being run directly or being imported.
  7. Displays metadata, scaled view, and grayscale for each image in the images list; a rough sketch of scale_image() and image_to_base64() follows below.
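
A minimal sketch of what scale_image() and image_to_base64() might look like, based on the notes above (the actual notebook cell may differ; the 320-pixel base width comes from note 2):

import base64
from io import BytesIO
from PIL import Image

def scale_image(img, baseWidth=320):
    # scale proportionally to a base width of 320 pixels
    scale = baseWidth / img.size[0]
    height = int(img.size[1] * scale)
    return img.resize((baseWidth, height))

def image_to_base64(img, format="PNG"):
    # write the PIL image into an in-memory buffer and base64-encode the bytes
    buffer = BytesIO()
    img.save(buffer, format=format)
    return base64.b64encode(buffer.getvalue()).decode("utf-8")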

Code Block 3

  1. The code defines an Image_Data class that encapsulates image data and methods to manipulate images (a rough class skeleton follows after this list).
  2. The Image_Data constructor initializes the object with source, label, file, path, and baseWidth attributes.
  3. The scale_image() method scales an image to the desired width.
  4. The image_to_html() method converts a PIL image to base64 HTML code.
  5. The image_to_html_grey() method converts a PIL image to grayscale.
  6. The image_data() function prepares a series of images and returns the path and images.
  7. The image_objects() function creates objects of Image_Data for each image and returns a list of them.
  8. The if __name__ == "__main__": section prints the metadata, scaled image, and grayscale image of each image object.
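
A rough skeleton of the Image_Data class as described above, assuming the attribute names from the notes (the real cell fills in more detail):

import base64
from io import BytesIO
from pathlib import Path
from PIL import Image

class Image_Data:
    def __init__(self, source, label, file, path, baseWidth=320):
        # store the metadata attributes listed in the notes
        self.source = source
        self.label = label
        self.file = file
        self.path = path
        self.baseWidth = baseWidth
        self.img = Image.open(Path(path) / file)

    def scale_image(self):
        # scale proportionally to the desired base width
        scale = self.baseWidth / self.img.size[0]
        height = int(self.img.size[1] * scale)
        self.img = self.img.resize((self.baseWidth, height))

    def image_to_html(self):
        # base64-encode the image and wrap it in an <img> tag
        buffer = BytesIO()
        self.img.save(buffer, format="PNG")
        data = base64.b64encode(buffer.getvalue()).decode("utf-8")
        return f'<img src="data:image/png;base64,{data}">'

    def image_to_html_grey(self):
        # convert to grayscale ("L" mode) before generating the HTML
        self.img = self.img.convert("L")
        return self.image_to_html()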

In the blog, add College Board practice problems for 2.2.

2.2


Choose 2 images, one that will more likely result in lossy data compression and one that is more likely to result in lossless data compression. Explain.

The image that might lose color as a result of lossy compression is this one: it has a lot of varying color, so it will look more pixelated when it is compressed.

The image that would not lose much detail is this one: the picture does not have varying shades of color, and therefore will not look as pixelated as the first picture after compression.
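
To back up this claim, here is a small sketch (hypothetical filenames) that saves the same picture once with lossy JPEG compression and once with lossless PNG compression, then compares file sizes:

import os
from PIL import Image

img = Image.open("colorful_photo.png").convert("RGB")  # hypothetical input file

# JPEG is lossy: a low quality setting throws away color detail for a smaller file
img.save("lossy.jpg", quality=25)

# PNG is lossless: every pixel value is preserved exactly
img.save("lossless.png", optimize=True)

print("JPEG size:", os.path.getsize("lossy.jpg"), "bytes")
print("PNG size: ", os.path.getsize("lossless.png"), "bytes")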


Project Addition

No, our project does not have any images in it.


Pick a programming paradigm and solve some of the following …

NumPy, manipulating pixels.

import numpy as np
from PIL import Image

# load the image and convert to a Numpy array
img = Image.open("image.png")
img_arr = np.array(img)

# define a function to manipulate the image by a certain color channel
def manipulate_color_channel(img_arr, color_channel, intensity):
    # work on a float copy so fractional intensities (like 1.5) don't fail on uint8 data
    manipulated_arr = img_arr.astype(float)
    # index the color channel and multiply by the desired intensity
    manipulated_arr[:, :, color_channel] *= intensity
    # make sure the pixel values are within the valid range of 0-255
    manipulated_arr = np.clip(manipulated_arr, 0, 255)
    # convert the array back to an image
    manipulated_img = Image.fromarray(manipulated_arr.astype(np.uint8))
    return manipulated_img

# manipulate the image using the red color channel
manipulated_img = manipulate_color_channel(img_arr, 0, 2)
manipulated_img.show()

# manipulate the image using the green color channel
manipulated_img = manipulate_color_channel(img_arr, 1, 1.5)
manipulated_img.show()

# manipulate the image using the blue color channel
manipulated_img = manipulate_color_channel(img_arr, 2, 0.5)
manipulated_img.show()

Binary and Hexadecimal reports.

import cv2
import numpy as np

# Read the image in grayscale
img = cv2.imread('example.jpg', cv2.IMREAD_GRAYSCALE)

# Convert to red scale (OpenCV loads images as BGR, so red is channel index 2)
red_img = np.zeros((img.shape[0], img.shape[1], 3), dtype=np.uint8)
red_img[:, :, 2] = img

# Display pixel values in binary and hexadecimal formats
for i in range(10):
    for j in range(10):
        pixel_value = int(red_img[i, j, 2])
        binary_value = '{0:08b}'.format(pixel_value)
        hex_value = '{0:02x}'.format(pixel_value)
        print(f"Pixel value: {pixel_value}, Binary value: {binary_value}, Hex value: {hex_value}")

Compression and Sizing of images.

from PIL import Image

# Open the original image
img = Image.open('original_image.jpg')

# Save the image in PNG format with maximum compression
img.save('compressed_image.png', optimize=True, compress_level=9)
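
A possible follow-up (assuming the same filenames as above) that reports the size on disk and also shrinks the dimensions; note that a photographic JPEG re-saved as PNG can actually grow, since PNG keeps every pixel exactly:

import os
from PIL import Image

img = Image.open('original_image.jpg')

# report size on disk before and after the conversion above
print("Original:  ", os.path.getsize('original_image.jpg'), "bytes")
print("Compressed:", os.path.getsize('compressed_image.png'), "bytes")

# halving the dimensions reduces the file size much more aggressively
smaller = img.resize((img.width // 2, img.height // 2))
smaller.save('smaller_image.png', optimize=True, compress_level=9)
print("Resized:   ", os.path.getsize('smaller_image.png'), "bytes")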

Blur the image or write Meta Data on screen, aka Title, Author and Image size.

from PIL import Image, ImageDraw, ImageFont, ImageFilter

class ImageEditor:
    def __init__(self, file_path):
        self.image = Image.open(file_path)
        self.width, self.height = self.image.size

    def blur(self, radius=10):
        self.image = self.image.filter(ImageFilter.GaussianBlur(radius))

    def add_meta_data(self, title, author):
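        # assumes arial.ttf is available on the system; substitute another .ttf path if it is not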
        font = ImageFont.truetype('arial.ttf', size=20)
        draw = ImageDraw.Draw(self.image)
        draw.text((10, 10), f'Title: {title}', font=font, fill='white')
        draw.text((10, 40), f'Author: {author}', font=font, fill='white')
        draw.text((10, self.height - 30), f'Image size: {self.width} x {self.height}', font=font, fill='white')

    def save_image(self, file_path):
        self.image.save(file_path)

# Example usage
editor = ImageEditor('example_image.jpg')
editor.blur(radius=15)
editor.add_meta_data('Beautiful Scenery', 'John Doe')
editor.save_image('edited_image.jpg')

2.3 hacks

In the blog, add notes and observations on each code cell.
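
The code blocks below operate on a DataFrame df with Student ID and GPA columns; a minimal sketch of how such a df could be built (the actual values in the lesson cell may differ):

import pandas as pd

# hypothetical sample data; the lesson notebook defines its own df
data = {
    "Student ID": ["ab123", "cd456", "ef789", "gh012"],
    "GPA": [3.57, 4.00, 2.95, 3.20]
}
df = pd.DataFrame(data)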

Code Block 1:

print(df[['GPA']])
# This code prints a DataFrame containing only the 'GPA' column of the original DataFrame 'df'. 

print()
# prints an empty line.

print(df[['Student ID','GPA']].to_string(index=False))
# This code prints a DataFrame containing the 'Student ID' and 'GPA' columns of the original DataFrame 'df', 
# with the index removed from the printed output.

Code Block 2:

print(df.sort_values(by=['GPA']))
# This code sorts the values in the original DataFrame 'df' by ascending order of the 'GPA' column, 
# and prints the sorted DataFrame.

print()
# empty line

print(df.sort_values(by=['GPA'], ascending=False))
# This code sorts the values in the original DataFrame 'df' by descending order of the 'GPA' column, 
# and prints the sorted DataFrame.

Code Block 3:


print(df[df.GPA > 3.00])
# This code filters the rows of the original DataFrame 'df' where the value of the 'GPA' column is greater than 3.00,
# and prints the resulting filtered DataFrame.

Code Block 4:

print(df[df.GPA == df.GPA.max()])
# This code filters the rows of the original DataFrame 'df' where the value of the 'GPA' column is equal to the maximum
# GPA value in the DataFrame, and prints the resulting filtered DataFrame.

print()
# empty line


print(df[df.GPA == df.GPA.min()])
# This code filters the rows of the original DataFrame 'df' where the value of the 'GPA' column is equal to the minimum 
# GPA value in the DataFrame, and prints the resulting filtered DataFrame.

Code Block 5:

import pandas as pd

# A python dictionary is defined with two keys 'calories' and 'duration', and their respective values as lists.
dict = {
  "calories": [420, 380, 390],
  "duration": [50, 40, 45]
}

# The python dictionary 'dict' is converted to a pandas DataFrame using the DataFrame() constructor.
# The resulting DataFrame 'df' has two columns, 'calories' and 'duration', and three rows with default integer index.
print("-------------Dict_to_DF------------------")
df = pd.DataFrame(dict)
print(df)

# The 'index' parameter of the DataFrame() constructor is used to label the rows of the DataFrame.
# The resulting DataFrame 'df' has two columns, 'calories' and 'duration', and three rows labeled as 'day1', 'day2', and 'day3'.
print("----------Dict_to_DF_labels--------------")
df = pd.DataFrame(dict, index = ["day1", "day2", "day3"])
print(df)

Code Block 6:

print("-------Examine Selected Rows---------")

# The .loc[] method is used to select multiple rows with labels 'day1' and 'day3'.
# The resulting DataFrame includes all columns of the original DataFrame 'df' for the selected rows.
print(df.loc[["day1", "day3"]])

# The .loc[] method is used to select a single row with label 'day1'.
# The resulting DataFrame includes all columns of the original DataFrame 'df' for the selected row.
print("--------Examine Single Row-----------")
print(df.loc["day1"])

Code Block 7:

import pandas as pd

# Read a CSV file 'data.csv' using the pandas library and sort the DataFrame based on the 'Duration' column.
df = pd.read_csv('files/data.csv').sort_values(by=['Duration'], ascending=False)

# Output the top 10 rows with the highest 'Duration' values.
print("--Duration Top 10---------")
print(df.head(10))

# Output the bottom 10 rows with the lowest 'Duration' values.
print("--Duration Bottom 10------")
print(df.tail(10))

Code Block 8:

# This code fetches data from a COVID-19 API endpoint using the requests library. The obtained data is then filtered to create a pandas DataFrame containing country statistics. 

'''Pandas can be used to analyze data'''
import pandas as pd
import requests

def fetch():
    '''Obtain data from an endpoint'''
    url = "https://flask.nighthawkcodingsociety.com/api/covid/"
    fetch = requests.get(url)
    json = fetch.json()

    # filter data for requirement
    df = pd.DataFrame(json['countries_stat'])  # filter endpoint for country stats
    print(df.loc[0:5, 'country_name':'deaths']) # show row 0 through 5 and columns country_name through deaths
    
fetch()

In the blog, add College Board practice problems for 2.3.

2.3

Create or Find your own dataset.

{
    "houses": [
        {
            "id": 1,
            "location": "Los Angeles",
            "bedrooms": 3,
            "bathrooms": 2,
            "square_feet": 1800,
            "price": 950000
        },
        {
            "id": 2,
            "location": "San Francisco",
            "bedrooms": 2,
            "bathrooms": 1,
            "square_feet": 1200,
            "price": 1200000
        },
        {
            "id": 3,
            "location": "New York City",
            "bedrooms": 4,
            "bathrooms": 3,
            "square_feet": 2500,
            "price": 1800000
        },
        {
            "id": 4,
            "location": "Chicago",
            "bedrooms": 3,
            "bathrooms": 2,
            "square_feet": 2000,
            "price": 700000
        },
        {
            "id": 5,
            "location": "Seattle",
            "bedrooms": 5,
            "bathrooms": 3,
            "square_feet": 3000,
            "price": 2200000
        },
        {
            "id": 6,
            "location": "Boston",
            "bedrooms": 2,
            "bathrooms": 1,
            "square_feet": 1000,
            "price": 750000
        },
        {
            "id": 7,
            "location": "Austin",
            "bedrooms": 4,
            "bathrooms": 2,
            "square_feet": 2200,
            "price": 900000
        },
        {
            "id": 8,
            "location": "Denver",
            "bedrooms": 3,
            "bathrooms": 2,
            "square_feet": 1900,
            "price": 650000
        },
        {
            "id": 9,
            "location": "Miami",
            "bedrooms": 2,
            "bathrooms": 1,
            "square_feet": 1100,
            "price": 550000
        },
        {
            "id": 10,
            "location": "Portland",
            "bedrooms": 4,
            "bathrooms": 2,
            "square_feet": 2400,
            "price": 1100000
        }
    ]
}

DataFrame:

import pandas as pd
 
# housing_data is the dictionary shown above; the list under its "houses" key becomes the rows of the DataFrame
df = pd.DataFrame(housing_data["houses"])

print("-----Cheapest houses in the United States------")
df_sorted = df.sort_values(by='price')

print(df_sorted)

output (tested in a separate .py file, against a slightly different sample dataset):

   house_id  num_bedrooms  num_bathrooms  square_feet  garage_capacity  year_built  sale_price
0         1             3              2         1500                2        2000      250000
1         2             4              3         2500                3        2010      500000
2         3             2              1         1000                1        1995      150000
3         4             3              2         1800                2        2005      300000
4         5             2              1          900                1        1980      125000
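
Going further, the same 2.3 filtering ideas can be applied to the houses DataFrame built above, for example:

# houses under $800,000
print(df[df.price < 800000])

# the single most expensive house
print(df[df.price == df.price.max()])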