E-commerce Business Report

A practice example of a PySpark business report

Author

Dennis Feyerabend

Published

December 1, 2025

Abstract
This project analyzes the e-commerce data of a fictitious company to identify key trends and insights. The report is written in Quarto and uses a custom theme (Arrakis Night). Its primary purpose is to practice PySpark and business reporting.

1 E-Commerce Business Report

1.1 Data overview:

The data for this report was created with the script generate_bigdata.py and stored in the data folder. The dataset used is ecommerce_5m.csv, containing 5 million rows.


1.1.1 Column Overview

| Column | Description | Values / Range |
|---|---|---|
| transaction_id | Unique identifier for each transaction | |
| customer_id | Unique identifier for each customer | |
| date | Transaction date | 2023-01-01 → 2024-12-30 |
| product_category | Product type | Beauty, Books, Electronics, Fashion, Food, Home & Garden, Sports, Toys |
| product_price | Product price in EUR | ≥ 0 |
| quantity | Quantity of products per transaction | ≥ 1 |
| payment_method | Payment method | Apple Pay, Bank Transfer, Cash on Delivery, Credit Card, Klarna, PayPal |
| country | Country of purchase | Austria, Belgium, Denmark, Finland, France, Germany, Ireland, Netherlands, Poland, Spain, Switzerland |
| customer_age | Customer age | 18–74 years |
| total | Total transaction amount in EUR | ≥ 0 |
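
The value ranges in the column overview can double as a basic data-quality check. The following is a minimal sketch in plain Python, not the report's own code: the function name `validate_row` is hypothetical, and it assumes rows arrive as dictionaries of strings (as `csv.DictReader` would produce) with dates in ISO format, which is how they appear in the overview.

```python
from datetime import date

# Allowed values and ranges copied from the column overview.
CATEGORIES = {"Beauty", "Books", "Electronics", "Fashion", "Food",
              "Home & Garden", "Sports", "Toys"}
PAYMENT_METHODS = {"Apple Pay", "Bank Transfer", "Cash on Delivery",
                   "Credit Card", "Klarna", "PayPal"}
DATE_MIN, DATE_MAX = date(2023, 1, 1), date(2024, 12, 30)


def validate_row(row: dict) -> list:
    """Return the list of rule violations for one row (empty if valid)."""
    problems = []
    if not DATE_MIN <= date.fromisoformat(row["date"]) <= DATE_MAX:
        problems.append("date out of range")
    if row["product_category"] not in CATEGORIES:
        problems.append("unknown product_category")
    if float(row["product_price"]) < 0:
        problems.append("negative product_price")
    if int(row["quantity"]) < 1:
        problems.append("quantity below 1")
    if row["payment_method"] not in PAYMENT_METHODS:
        problems.append("unknown payment_method")
    if not 18 <= int(row["customer_age"]) <= 74:
        problems.append("customer_age outside 18-74")
    if float(row["total"]) < 0:
        problems.append("negative total")
    return problems


# Hypothetical example row; the values are invented for illustration.
sample = {
    "transaction_id": "T1", "customer_id": "C1", "date": "2023-06-15",
    "product_category": "Books", "product_price": "12.50", "quantity": "2",
    "payment_method": "PayPal", "country": "Germany",
    "customer_age": "34", "total": "25.00",
}
print(validate_row(sample))  # → []
```

On 5 million rows the same rules would more sensibly be expressed as filters in the processing engine; the sketch only makes the documented ranges explicit.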

1.2 Goals of analyses

The report aims to answer the following questions:

  1. Which country is most important for business?
  2. Which products generate the most revenue?
  3. Which customers are most important and how are they characterized?

2 Revenue per country

2.1 Insights: Country revenue contribution over the last two years:

  • Germany is the clear core market, contributing 45% of total revenue over the last two years. This high concentration makes Germany the primary driver of overall business performance.
  • Austria (15%), Switzerland (12%), and the Netherlands (10%) form a stable second tier, accounting for ~37% combined. These markets provide meaningful diversification and reduce dependence on Germany.
  • Belgium (8%), France (6%), and Poland (4%) together contribute about 18%, representing smaller but steady markets with limited impact on overall revenue trends.
  • The geographic revenue distribution is highly stable, suggesting a mature market structure with no major shifts in regional demand.

| country | order_volume | total_revenue_mill | avg_revenue | revenue_percent |
|---|---|---|---|---|
| Germany | 2250568 | 1,287.82 | 572.22 | 45.04 |
| Austria | 750533 | 428.72 | 571.21 | 14.99 |
| Switzerland | 599950 | 343.80 | 573.05 | 12.02 |
| Netherlands | 499471 | 285.42 | 571.44 | 9.98 |
| Belgium | 399549 | 228.03 | 570.71 | 7.98 |
| France | 300321 | 171.71 | 571.76 | 6.01 |
| Poland | 199608 | 113.76 | 569.93 | 3.98 |
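
The percentage column and the tier totals can be recomputed directly from the revenue figures in the table. This is a small plain-Python sketch rather than the PySpark code behind the report; the helper `tier_share` and the grouping of countries into tiers are illustrative assumptions.

```python
# Revenue in million EUR, copied from the country table.
revenue_mill = {
    "Germany": 1287.82, "Austria": 428.72, "Switzerland": 343.80,
    "Netherlands": 285.42, "Belgium": 228.03, "France": 171.71,
    "Poland": 113.76,
}

total = sum(revenue_mill.values())
# Share of total revenue per country, in percent.
share = {country: 100 * rev / total for country, rev in revenue_mill.items()}


def tier_share(countries):
    """Combined revenue share (in percent) of a group of countries."""
    return sum(share[c] for c in countries)


second_tier = tier_share(["Austria", "Switzerland", "Netherlands"])
third_tier = tier_share(["Belgium", "France", "Poland"])

print(round(share["Germany"], 1), round(second_tier, 1), round(third_tier, 1))
# → 45.0 37.0 18.0
```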

2.2 Insights: Yearly country revenue:

  • There are no noticeable changes in the countries' revenue contributions between 2023 and 2024.

2.3 Insights: Monthly country revenue for 2024:

  • Revenue contribution between countries has been stable over the last year.

2.4 Key insight:

  • The company revenue is heavily dependent on the German market. Diversification is needed.

3 Revenue per Product Category

3.1 Insights: Product revenue contribution

  • Electronics dominate total revenue generation (53.5%) but expose the business to concentration risk. Any sector downturn in electronics would heavily impact total revenue.
  • The four lowest-performing categories (Toys, Beauty, Books, and Food) represent only ~7% of revenue. These categories could be candidates for portfolio optimization, consolidation, or targeted growth strategies.

| product_category | order_volume | total_revenue_mill | avg_revenue | revenue_percent |
|---|---|---|---|---|
| Electronics | 1248965 | 1,530.18 | 1,225.16 | 53.52 |
| Home & Garden | 750846 | 459.77 | 612.33 | 16.08 |
| Fashion | 999124 | 367.59 | 367.91 | 12.86 |
| Sports | 600309 | 299.19 | 498.39 | 10.46 |
| Toys | 300645 | 74.84 | 248.94 | 2.62 |
| Beauty | 400233 | 73.56 | 183.80 | 2.57 |
| Books | 499513 | 34.38 | 68.83 | 1.20 |
| Food | 200365 | 19.74 | 98.54 | 0.69 |
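
To make the concentration visible, the categories can be ranked by revenue and their shares accumulated. The sketch below is plain Python over the table values, not the report's PySpark code; the variable names are illustrative. It also derives the average revenue per order, which shows why Fashion, despite having more orders than Home & Garden, generates less revenue.

```python
from itertools import accumulate

# (order_volume, revenue in million EUR), copied from the category table.
categories = {
    "Electronics": (1248965, 1530.18),
    "Home & Garden": (750846, 459.77),
    "Fashion": (999124, 367.59),
    "Sports": (600309, 299.19),
    "Toys": (300645, 74.84),
    "Beauty": (400233, 73.56),
    "Books": (499513, 34.38),
    "Food": (200365, 19.74),
}

total = sum(rev for _, rev in categories.values())

# Rank categories by revenue, highest first.
ranked = sorted(categories.items(), key=lambda kv: kv[1][1], reverse=True)

# Running total of revenue share (percent) down the ranking.
cumulative = list(accumulate(100 * rev / total for _, (_, rev) in ranked))

# Average revenue per order in EUR.
avg_order = {name: rev * 1_000_000 / orders
             for name, (orders, rev) in categories.items()}

for (name, _), cum in zip(ranked, cumulative):
    print(f"{name:<14}{avg_order[name]:>10.2f}{cum:>8.1f}")
```

The last column shows how quickly the ranking approaches 100%: the top four categories already cover roughly 93% of revenue, leaving about 7% for the remaining four.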

3.2 Insights: Product revenue contribution over the last two years (2023 + 2024):

  • There are no noticeable changes in the product categories' revenue contributions between 2023 and 2024.

3.3 Monthly product revenue contribution for 2024:

  • Product category revenue shares are highly stable throughout the year, indicating predictable customer demand and low volatility across categories.

3.4 Key insight:

  • The company revenue is heavily dependent on Electronics. Diversification is recommended.

4 Top products per country

4.1 Insights: Top revenue generating products per country:

  • The top products are the same across countries.
  • The highest-revenue products are Electronics, followed by Home & Garden articles and then Fashion.

5 Customer analysis

5.1 Insights: Top customers:

  • Revenue contribution increases almost linearly across the entire customer base, indicating a steady relationship between customer count and revenue generated.
  • Revenue is broadly distributed rather than concentrated among a small set of high-value customers. No single customer group disproportionately drives total revenue.
  • No distinct “VIP customer” segment emerges. Instead, a larger upper segment of customers collectively represents the most valuable group, contributing meaningfully without extreme outliers.
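
One way to check for a VIP segment is to ask what share of total revenue comes from the top fraction of customers. The sketch below is plain Python with two invented customer bases, since the report's per-customer figures are not shown; the function name `top_share` and both data sets are hypothetical.

```python
def top_share(revenue_per_customer, top_fraction):
    """Share of total revenue contributed by the top `top_fraction` of customers."""
    ranked = sorted(revenue_per_customer, reverse=True)
    k = max(1, round(len(ranked) * top_fraction))  # number of top customers
    return sum(ranked[:k]) / sum(ranked)


# Hypothetical customer bases, invented for illustration only.
broad = [100 + i for i in range(100)]      # revenues 100..199, fairly even
concentrated = [10] * 90 + [1000] * 10     # ten customers dominate

print(round(top_share(broad, 0.10), 3))         # → 0.13
print(round(top_share(concentrated, 0.10), 3))  # → 0.917
```

A broadly distributed base, as described in the insights above, behaves like the first case: the top 10% of customers contribute only slightly more than 10% of revenue. A base with a true VIP segment would look like the second.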

6 Summary of Results

6.1 Key Insights

  • Revenue is concentrated in a small set of countries, with Germany being the most important (followed by Austria, Switzerland, and the Netherlands).
  • A few product categories account for the majority of total sales.
  • Category and country contributions remain stable over time, indicating predictable customer demand.
  • The same products consistently appear as top revenue drivers within each country.
  • There is no distinct small group of “VIP customers”. Instead, the majority of customers contribute meaningfully to total revenue.

7 Recommendations

  • Strengthen marketing and promotional strategies in the top-performing countries.
  • Evaluate opportunities to expand in mid-tier markets with growing potential.
  • Prioritize inventory planning for high-impact product categories and top-selling items.