Synthetic Dataset Example
A three-year synthetic cash flow dataset with known patterns, such as linear trends and seasonal spikes, for testing an LSTM model under controlled conditions.
Dataset Overview
For this example, we've created a synthetic dataset that simulates 3 years of daily cash flow data starting from January 1, 2022. The dataset includes controlled patterns such as:
- Linear income trends with gradual growth
- Weekly seasonality with higher income on weekends
- Monthly seasonality with peaks at the beginning of each month
- Quarterly seasonality with end-of-quarter spikes
- Annual seasonality with holiday season increases
- Random noise to simulate real-world variability
This controlled environment lets us test the LSTM model's ability to learn and predict patterns that commonly occur in financial data, while knowing the ground truth behind each pattern.
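The patterns listed above could be layered as additive components, roughly as in the following sketch. The function name `make_synthetic_income`, every amplitude, the three-day windows used for the monthly and quarterly spikes, and the choice of November and December as the holiday season are assumptions made for illustration; they are not the values used to build the original dataset. Only income is generated here; expenses could be built the same way.

```python
import calendar
from datetime import date, timedelta

import numpy as np


def make_synthetic_income(start=date(2022, 1, 1), end=date(2024, 12, 31), seed=42):
    """Build a daily income series from layered, known components.

    Every magnitude below is an illustrative assumption.
    Returns the list of dates, the noisy income, and the noise-free signal.
    """
    n = (end - start).days + 1  # inclusive day count
    days = [start + timedelta(days=i) for i in range(n)]
    t = np.arange(n)
    dow = np.array([d.weekday() for d in days])  # Monday=0 ... Sunday=6
    dom = np.array([d.day for d in days])        # day of month
    month = np.array([d.month for d in days])
    # Days remaining until the end of the month (0 on the last day).
    days_left = np.array(
        [calendar.monthrange(d.year, d.month)[1] - d.day for d in days]
    )

    trend = 1000.0 + 0.5 * t                    # linear growth
    weekly = np.where(dow >= 5, 150.0, 0.0)     # weekend uplift
    monthly = np.where(dom <= 3, 200.0, 0.0)    # start-of-month peak
    quarterly = np.where(                       # end-of-quarter spike
        np.isin(month, [3, 6, 9, 12]) & (days_left < 3), 300.0, 0.0
    )
    annual = np.where(month >= 11, 250.0, 0.0)  # holiday season (assumed Nov-Dec)

    signal = trend + weekly + monthly + quarterly + annual
    noise = np.random.default_rng(seed).normal(0.0, 50.0, size=n)
    return days, signal + noise, signal


days, income, signal = make_synthetic_income()
print(len(days), days[0], days[-1])
```

Returning the noise-free `signal` alongside the noisy series keeps the ground truth available, so later predictions can be compared against the underlying pattern as well as the observed values.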
Dataset Characteristics
Time Period
January 1, 2022 - December 31, 2024 (1,096 days, since 2024 is a leap year)
Features
Date, Day of Week, Income, Expenses, Net Cash Flow
Patterns
Linear trends, weekly, monthly, quarterly, and annual seasonality
Purpose
Testing LSTM performance under controlled conditions
Step-by-Step Process
First, we create a synthetic dataset with controlled patterns (linear trends, seasonal factors, and random noise) that simulates real-world cash flow data.
Next, we preprocess the data, engineer features, and train an LSTM model with bidirectional layers and self-attention to capture temporal patterns.
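One common way to prepare a daily series for an LSTM is to split it chronologically, scale it with statistics from the training portion only, and slice it into fixed-length windows. The sketch below illustrates that idea with NumPy. The helper names `fit_minmax` and `make_windows`, the 30-day lookback, the 80/20 split, and min-max scaling are all assumptions; the preprocessing actually used for this example is not specified.

```python
import numpy as np


def fit_minmax(train):
    """Return a scaler fitted on the training data only (assumes a non-constant series)."""
    lo, hi = float(np.min(train)), float(np.max(train))
    return lambda x: (np.asarray(x, dtype=float) - lo) / (hi - lo)


def make_windows(series, lookback=30, horizon=1):
    """Slice a 1-D series into (samples, lookback, 1) inputs and single-step targets."""
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - lookback - horizon + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested lookback and horizon")
    X = np.stack([series[i:i + lookback] for i in range(n_samples)])
    y = np.array([series[i + lookback + horizon - 1] for i in range(n_samples)])
    return X[..., np.newaxis], y


# Stand-in series; in practice this would be a column such as net cash flow.
series = np.arange(1096, dtype=float)

# Chronological split: no shuffling, so the test set lies strictly in the future.
split = int(len(series) * 0.8)
train, test = series[:split], series[split:]

# Fit the scaler on the training portion to avoid leaking test statistics.
scale = fit_minmax(train)
X_train, y_train = make_windows(scale(train))
X_test, y_test = make_windows(scale(test))
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)
```

The trailing axis of size 1 gives the `(samples, timesteps, features)` layout that recurrent layers in most deep learning frameworks expect. Extra engineered features would extend that last axis.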
Finally, we evaluate the trained model on test data, calculate performance metrics, and plot predictions against actual values.
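The text does not name the performance metrics. As a minimal sketch, assuming the usual regression metrics (MAE, RMSE, and R²) are of interest, they can be computed directly from the actual and predicted values. The function name `regression_metrics` and the toy numbers are hypothetical.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, and R^2 for two equally sized 1-D arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    ss_res = float(np.sum(err ** 2))                        # residual sum of squares
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))   # total sum of squares
    return {
        "mae": float(np.mean(np.abs(err))),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "r2": 1.0 - ss_res / ss_tot,
    }


# Toy values for illustration only; real inputs would be test targets and model output.
metrics = regression_metrics([100, 120, 90, 110], [110, 115, 95, 100])
print(metrics)
```

If the series was scaled before training, both arrays should be converted back to the original units first, so that MAE and RMSE can be read as cash amounts.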
Expected Outcomes
With this synthetic dataset, we expect the LSTM model to:
- Learn the underlying linear trends in income and expenses
- Capture weekly seasonality patterns (weekend vs. weekday differences)
- Identify monthly and quarterly patterns
- Recognize annual seasonal effects
- Filter out random noise to focus on meaningful patterns
The controlled nature of this dataset allows us to evaluate exactly how well the model learns each type of pattern, providing insights into the strengths and limitations of LSTM networks for financial forecasting.
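Because each pattern is tied to known calendar positions, one way to see which patterns a model has learned is to break the error down by those positions. The sketch below groups absolute error by a label array. The helper `mae_by_group` and the toy "model" that ignores weekends are hypothetical, chosen only to show how a missed weekly pattern would surface.

```python
import numpy as np


def mae_by_group(y_true, y_pred, labels):
    """Mean absolute error for each distinct label value."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    labels = np.asarray(labels)
    abs_err = np.abs(y_pred - y_true)
    return {str(g): float(abs_err[labels == g].mean()) for g in np.unique(labels)}


# Two weeks of toy data with a weekend uplift (Monday=0 ... Sunday=6).
dow = np.arange(14) % 7
y_true = np.where(dow >= 5, 1150.0, 1000.0)

# A hypothetical model that has not learned the weekend effect.
y_pred = np.full(14, 1000.0)

labels = np.where(dow >= 5, "weekend", "weekday")
errors = mae_by_group(y_true, y_pred, labels)
print(errors)
```

The same grouping could use labels such as "start of month", "quarter end", or "holiday season" to check the other seasonal components. Comparing the overall error with the standard deviation of the injected noise gives a rough sense of how close the model is to the irreducible error.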