Files
lc_stylist_agent/data_ingestion

Steps

  1. Prepare products-all.json and the image_data folder, downloading them with JavaScript. Save both in a new folder, ./data/{BATCH_SOURCE}, and give each incoming batch of data its own batch_source ID.
  2. Run process_item.py to extract the category, gender, and occasions for each item. The output is written to ./data/{BATCH_SOURCE}/metadata_extraction.json. This step should run on an H200 device.
  3. Organize all the data and embed it into the database locally using run_ingestion.py.
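
Before running step 2, it can help to confirm that a batch folder contains the inputs from step 1. The following is a minimal sketch, not part of the repository: the helper names (batch_dir, missing_inputs, metadata_path) are hypothetical, and the only assumption about layout is the ./data/{BATCH_SOURCE} structure described in the steps.

```python
from pathlib import Path


def batch_dir(data_root, batch_source):
    """Return the folder for one batch, e.g. ./data/2025_q4 (hypothetical helper)."""
    return Path(data_root) / batch_source


def missing_inputs(data_root, batch_source):
    """List the step-1 inputs that are not yet present in the batch folder."""
    folder = batch_dir(data_root, batch_source)
    missing = []
    # products-all.json is expected to be a regular file.
    if not (folder / "products-all.json").is_file():
        missing.append("products-all.json")
    # image_data is expected to be a folder of downloaded images.
    if not (folder / "image_data").is_dir():
        missing.append("image_data")
    return missing


def metadata_path(data_root, batch_source):
    """Return where step 2 is described as writing its output."""
    return batch_dir(data_root, batch_source) / "metadata_extraction.json"
```

For example, missing_inputs("./data", "2025_q4") returns an empty list once both inputs are in place, and metadata_path("./data", "2025_q4") gives the expected location of the step-2 output.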

Raw Data Structure

## products-all.json
{
    "id": "BUL808",
    "name": "SARAH ZHUANG - 'Click & Link' diamond 18k gold earrings",
    "brand": "SARAH ZHUANG",
    "category": "Fine Jewellery And Watches",
    "subcategory": "General",
    "price": 17500,
    "currency": "HKD",
    "description": "Sarah Zhuang's Click & Link earrings embrace the allure of geometry. Forged into elegant rectangles with one side encrusted with diamonds, this gold pair will certainly elevate your cocktail ensembles.",
    "tags": [
    "sarah zhuang",
    "fine jewellery and watches",
    "in-stock",
    "new",
    "sarah",
    "zhuang",
    "'click",
    "link'",
    "diamond"
    ],
    "imageUrl": "https://media.lanecrawford.com/B/U/L/BUL808_in_xl.jpg",
    "url": "https://www.lanecrawford.com.hk/product/sarah-zhuang/-click-link-diamond-18k-gold-earrings/_/BUL808/product.lc?utm_medium=embed&utm_source=ai-recommended&utm_campaign=2025-christmas_lc_ai-recommended",
    "color": "YELLOW GOLD",
    "groupName": "Fine Jewellery",
    "deptName": "Women's  Fine Jewellery",
    "onlineBU": "Fine Jewellery",
    "stockAvailability": true
}

Example in metadata_extraction.json

"EOJ367": {
    "category": "shoes",
    "gender": "female",
    "applicable_occasions": [
        "Casual",
        "Outdoor",
        "Travel / Transit"
    ],
    "inappropriate_occasions": [
        "Formal",
        "Black Tie / White Tie",
        "Bridal / Wedding",
        "Business / workwear",
        "Cocktail / Semi-Formal"
    ]
}
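
A quick sanity check on each entry of metadata_extraction.json can catch problems before ingestion. This is a sketch under assumptions: the function name check_extraction_entry is hypothetical, the required keys are taken from the example above, and the rule that an occasion should not appear in both lists is an assumption rather than something the pipeline is known to enforce.

```python
# Keys seen in the metadata_extraction.json example above.
REQUIRED_KEYS = (
    "category",
    "gender",
    "applicable_occasions",
    "inappropriate_occasions",
)


def check_extraction_entry(item_id, entry):
    """Return a list of human-readable problems found in one entry."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in entry:
            problems.append(f"{item_id}: missing '{key}'")
    applicable = set(entry.get("applicable_occasions", []))
    inappropriate = set(entry.get("inappropriate_occasions", []))
    # Assumption: an occasion listed in both lists is a labelling conflict.
    for occasion in sorted(applicable & inappropriate):
        problems.append(
            f"{item_id}: '{occasion}' is both applicable and inappropriate"
        )
    return problems
```

An entry shaped like the EOJ367 example produces no problems, while an entry that lists "Formal" in both lists would be reported.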

Metadata in Vector Database

{
    'item_id': 'EOJ128',
    'category': 'sunglasses', 
    'gender': 'unisex', 
    'modality': 'image',
    'brand': 'CELINE',
    'color': 'BROWN', 
    'description': "Immerse yourself in the depth of classic style with CELINE\'s Tortoiseshell Logo Sunglasses. Featuring a rich, tortoiseshell acetate frame and adorned with the iconic CELINE logo in gold, these sunglasses are a testament to timeless elegance and luxury. Perfect for those who appreciate a sophisticated aesthetic, they offer optimal UV protection while ensuring you remain at the forefront of fashion.",
    'tags': 'celine,accessories,in-stock,new,maxi,triomphe,acetate,round', 
    'price': 4500, 
    'url': 'https://www.lanecrawford.com.hk/product/celine/maxi-triomphe-acetate-round-sunglasses/_/EOJ128/product.lc?utm_medium=embed&utm_source=ai-recommended&utm_campaign=2025-christmas_lc_ai-recommended',
    'batch_source': '2025_q4',
    'Outdoor': 0, 
    'Ski / Snow / Mountain': 0, 
    'Festival / Concert': 0, 
    'Activewear': 0, 
    'Casual': 1, 
    'Cocktail / Semi-Formal': -1, 
    'Formal': -1, 
    'Party / Clubbing': 0, 
    'Evening': 0, 
    'Travel / Transit': 0, 
    'Beach / Swim': 0, 
    'Garden Party / Daytime Event': 1, 
    'Black Tie / White Tie': -1, 
    'Resort': 1, 
    'Athleisure': 0, 
    'Business / workwear': -1, 
    'Bridal / Wedding': -1, 
}
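
The flat metadata above appears to combine fields from products-all.json with the output of metadata_extraction.json. The sketch below shows one way that merge could look; it is not the code in run_ingestion.py. The function names are hypothetical, and the mapping of applicable occasions to 1, inappropriate occasions to -1, and all others to 0 is inferred from the examples rather than confirmed. The comma-joined tags string likewise mirrors the example.

```python
# Occasion labels copied from the vector-database example above.
OCCASIONS = [
    "Outdoor", "Ski / Snow / Mountain", "Festival / Concert", "Activewear",
    "Casual", "Cocktail / Semi-Formal", "Formal", "Party / Clubbing",
    "Evening", "Travel / Transit", "Beach / Swim",
    "Garden Party / Daytime Event", "Black Tie / White Tie", "Resort",
    "Athleisure", "Business / workwear", "Bridal / Wedding",
]


def occasion_flags(extracted):
    """Map each occasion to 1 (applicable), -1 (inappropriate), or 0 (neither)."""
    applicable = set(extracted.get("applicable_occasions", []))
    inappropriate = set(extracted.get("inappropriate_occasions", []))
    flags = {}
    for occasion in OCCASIONS:
        if occasion in applicable:
            flags[occasion] = 1
        elif occasion in inappropriate:
            flags[occasion] = -1
        else:
            flags[occasion] = 0
    return flags


def build_metadata(product, extracted, batch_source, modality="image"):
    """Merge one product record and its extraction entry into flat metadata."""
    metadata = {
        "item_id": product["id"],
        "category": extracted["category"],   # from metadata_extraction.json
        "gender": extracted["gender"],       # from metadata_extraction.json
        "modality": modality,
        "brand": product["brand"],
        "color": product["color"],
        "description": product["description"],
        "tags": ",".join(product["tags"]),   # list flattened to one string
        "price": product["price"],
        "url": product["url"],
        "batch_source": batch_source,
    }
    metadata.update(occasion_flags(extracted))
    return metadata
```

Storing each occasion as its own numeric key keeps the metadata flat, which suits vector stores that only accept scalar metadata values; whether that is the reason for this layout here is an assumption.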