December 2023 – siddharthjain

Hello,

We have started working on the project report. As of today, we have finalized the issues, discussion, and results sections, and we plan to complete the remaining sections by tomorrow.

December 4, 2023 (updated December 10, 2023)

Dec 4, 2023

Hello,

In this section, I conducted Time Series Decomposition for two key economic indicators: Hotel Occupancy Rates and Logan Airport Passengers. Here’s a breakdown of the process and the results:

– I merged the “Year” and “Month” columns into a single datetime column and set it as the index of the dataset, which makes time series analysis easier.

– I selected two economic indicators for analysis: Hotel Occupancy Rates and Logan Airport Passengers.
– For each indicator, I performed seasonal decomposition using the `seasonal_decompose` function from `statsmodels.tsa.seasonal`. I used an additive model and specified a period of 12, since the data is monthly and the seasonal pattern is assumed to repeat annually.

I plotted the decomposition results for both indicators, with each plot displaying four components:

1. Observed Data: This plot shows the actual values of the economic indicator over time, in this case, Hotel Occupancy Rates and Logan Airport Passengers.

2. Trend Component: The trend plot shows the long-term direction of the data once the seasonal swings are smoothed out. It helps identify whether the indicator is generally increasing, decreasing, or holding steady.

3. Seasonal Component: The seasonal plot displays the recurring patterns or seasonality in the data. It helps identify any regular fluctuations that occur at specific times of the year.

4. Residuals: The residuals plot represents the remaining variation in the data after removing the trend and seasonal components. It can provide insights into irregularities or unexpected changes in the data.
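Because an additive model was used, the four components are tied together by a simple identity. The symbols below are illustrative shorthand rather than notation from the original analysis: $y_t$ is the observed value in month $t$, $T_t$ the trend, $S_t$ the seasonal component, and $R_t$ the residual.

```latex
y_t = T_t + S_t + R_t
\qquad\Longleftrightarrow\qquad
R_t = y_t - T_t - S_t
```

In other words, the residual is whatever is left of the observed value after the trend and seasonal estimates are subtracted.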

Formatting and Visualization:
– To enhance readability, I formatted the x-axis of the plots to display years using the `matplotlib.dates` module (conventionally imported as `mdates`). This makes it easier to identify trends and seasonality over time.

Interpretation:
– Time series decomposition is a valuable technique for understanding the underlying patterns and components within economic indicators. It allows us to separate the data into its constituent parts, making it easier to identify trends, seasonality, and irregularities.

– By examining the decomposition plots, we can gain insights into how Hotel Occupancy Rates and Logan Airport Passengers vary over time. This information can be crucial for making informed decisions and strategic planning in various sectors, including tourism and hospitality.

Overall, time series decomposition is a powerful tool for uncovering meaningful patterns within economic data, enabling better analysis and forecasting.

December 1, 2023 (updated December 10, 2023)

Dec 1, 2023

Hello,
In this section, I conducted a clustering analysis to identify patterns within the dataset. Here’s a summary of the steps and outcomes:

To prepare the data for clustering, I started by standardizing the dataset. K-means relies on distances between points, so variables measured on larger scales would otherwise dominate the result. I used the `StandardScaler` from `sklearn.preprocessing`, which rescales each variable to zero mean and unit variance. I excluded the “Year” and “Month” columns from this step.
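As a small sketch of this preprocessing step, assuming a synthetic table and two hypothetical indicator columns (`hotel_occup_rate` and `logan_passengers` are placeholder names, not the real column names):

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data with two indicators on very different scales.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Year": np.repeat([2018, 2019], 12),
    "Month": np.tile(np.arange(1, 13), 2),
    "hotel_occup_rate": rng.uniform(0.5, 0.95, 24),
    "logan_passengers": rng.integers(2_000_000, 4_000_000, 24),
})

# Leave the calendar columns out of the scaling.
feature_cols = df.columns.drop(["Year", "Month"])

# Rescale every remaining column to zero mean and unit variance.
scaler = StandardScaler()
scaled = scaler.fit_transform(df[feature_cols])
scaled_df = pd.DataFrame(scaled, columns=feature_cols, index=df.index)

print(scaled_df.describe().loc[["mean", "std"]])
```

Note that `describe()` reports the sample standard deviation, so the printed value is slightly above 1, whereas `StandardScaler` uses the population standard deviation.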

To choose the number of clusters for the K-means algorithm, I used the Elbow Method. This involves running K-means for a range of cluster counts (from 1 to 10) and calculating the Within-Cluster Sum of Squares (WCSS) for each. The WCSS is the sum of squared distances between each data point and the centroid of its assigned cluster, so lower values mean tighter clusters. Because WCSS generally keeps falling as clusters are added, the useful signal in the plot is the “elbow”: the point where the decrease starts to level off, which suggests a reasonable number of clusters.
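The Elbow Method loop can be sketched as below. The data here comes from `make_blobs` purely for illustration, and `n_init=10` is an assumed setting that the description above does not mention. In scikit-learn, the WCSS of a fitted model is available as the `inertia_` attribute.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic data with three well-separated groups (not the real dataset).
X, _ = make_blobs(n_samples=120, centers=3, n_features=4, random_state=42)
X_scaled = StandardScaler().fit_transform(X)

# Fit K-means for k = 1..10 and record the WCSS of each fit.
k_values = list(range(1, 11))
wcss = []
for k in k_values:
    model = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=42)
    model.fit(X_scaled)
    wcss.append(model.inertia_)

# Plot WCSS against k; the bend in the curve is the "elbow".
fig, ax = plt.subplots()
ax.plot(k_values, wcss, marker="o")
ax.set_xlabel("Number of clusters (k)")
ax.set_ylabel("WCSS")
ax.set_title("Elbow Method")
fig.savefig("elbow.png")  # placeholder file name
```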

Based on the Elbow Method analysis, I ran K-means clustering with two candidate cluster counts: 3 and 4. The `KMeans` class from `sklearn.cluster` was used for this purpose. The models were initialized with the “k-means++” method, which spreads out the initial centroids and typically improves convergence, and a fixed random state (`random_state=42`) was set for reproducibility.

To assign data points to clusters, I used the `fit_predict` method, which fits the model and returns a cluster label for each data point in a single call. This produced two sets of labels, one for 3 clusters and one for 4.
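A minimal sketch of fitting both models, again on synthetic `make_blobs` data rather than the real dataset. The `init` and `random_state` values follow the description above, while `n_init=10` is an assumption added for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic, standardized data (stand-in for the scaled indicators).
X, _ = make_blobs(n_samples=120, centers=4, n_features=3, random_state=42)
X_scaled = StandardScaler().fit_transform(X)

# Two models that differ only in the number of clusters.
kmeans_3 = KMeans(n_clusters=3, init="k-means++", n_init=10, random_state=42)
kmeans_4 = KMeans(n_clusters=4, init="k-means++", n_init=10, random_state=42)

# fit_predict fits the model and returns one integer label per row.
labels_3 = kmeans_3.fit_predict(X_scaled)
labels_4 = kmeans_4.fit_predict(X_scaled)

print(labels_3[:10])
print(labels_4[:10])
```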

I added the cluster information back to the original dataset for further analysis. This allowed me to understand which cluster each data point belonged to, providing insights into the distinct groups or patterns within the data.

To provide a glimpse of the results, I displayed the first few rows of each dataset with cluster information. This inspection offers a preliminary view of how data points are distributed among clusters.
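Attaching the labels to the original rows and previewing them could look like the sketch below. The helper `with_clusters`, the `Cluster` column name, the indicator column names, and the data itself are all hypothetical, and `n_init=10` is an assumed setting.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the original (unscaled) dataset.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "Year": np.repeat(np.arange(2016, 2020), 12),
    "Month": np.tile(np.arange(1, 13), 4),
    "hotel_occup_rate": rng.uniform(0.5, 0.95, 48),
    "logan_passengers": rng.uniform(2e6, 4e6, 48),
})


def with_clusters(data, n_clusters):
    """Return a copy of `data` with a "Cluster" column of K-means labels."""
    # Cluster on the standardized indicators only, not on Year/Month.
    features = data.drop(columns=["Year", "Month"])
    scaled = StandardScaler().fit_transform(features)
    model = KMeans(
        n_clusters=n_clusters, init="k-means++", n_init=10, random_state=42
    )
    labeled = data.copy()  # keep the original values; only add the label
    labeled["Cluster"] = model.fit_predict(scaled)
    return labeled


df_3 = with_clusters(df, 3)
df_4 = with_clusters(df, 4)

# Preview the first few rows of each labeled dataset.
print(df_3.head())
print(df_4.head())
```

Working on copies keeps the original values intact, so the 3-cluster and 4-cluster assignments can be compared side by side.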

Clustering analysis helps identify inherent structures or groups within the dataset, enabling a deeper understanding of data patterns and trends. It can be a valuable tool for segmentation and decision-making in various domains.
