## Steps

1. Prepare `products-all.json` and the `image_data` folder, using JavaScript to download them. Save both in a new folder, `./data/{BATCH_SOURCE}`, and assign a new `batch_source` ID to each incoming batch of data.

2. Run `process_item.py` to classify the category, gender, and occasions of each item. The output is written to `./data/{BATCH_SOURCE}/metadata_extraction.json`. This step should run on an H200 device.

3. Organize all the data, then embed it into the db locally using `run_ingestion.py`.

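The folder layout described in the steps above can be checked before running `process_item.py`. The following is a minimal sketch, not part of the repository: the helper names `batch_paths` and `missing_inputs` are hypothetical, and only the file and folder names come from the steps themselves.

```python
from pathlib import Path


def batch_paths(batch_source: str, data_root: str = "./data") -> dict[str, Path]:
    """Return the paths that the three steps read from or write to for one batch."""
    batch_dir = Path(data_root) / batch_source
    return {
        "batch_dir": batch_dir,
        "products": batch_dir / "products-all.json",          # input of step 1
        "images": batch_dir / "image_data",                   # input of step 1
        "metadata": batch_dir / "metadata_extraction.json",   # output of step 2
    }


def missing_inputs(batch_source: str, data_root: str = "./data") -> list[str]:
    """List the step 1 inputs that are not yet present for this batch."""
    paths = batch_paths(batch_source, data_root)
    missing = []
    if not paths["products"].is_file():
        missing.append("products-all.json")
    if not paths["images"].is_dir():
        missing.append("image_data")
    return missing
```

Calling `missing_inputs("2025_q4")` returns an empty list once both inputs are in place for that batch.
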
## Raw Data Structure

`products-all.json`:

```json
{
  "id": "BUL808",
  "name": "SARAH ZHUANG - 'Click & Link' diamond 18k gold earrings",
  "brand": "SARAH ZHUANG",
  "category": "Fine Jewellery And Watches",
  "subcategory": "General",
  "price": 17500,
  "currency": "HKD",
  "description": "Sarah Zhuang's Click & Link earrings embrace the allure of geometry. Forged into elegant rectangles with one side encrusted with diamonds, this gold pair will certainly elevate your cocktail ensembles.",
  "tags": [
    "sarah zhuang",
    "fine jewellery and watches",
    "in-stock",
    "new",
    "sarah",
    "zhuang",
    "'click",
    "link'",
    "diamond"
  ],
  "imageUrl": "https://media.lanecrawford.com/B/U/L/BUL808_in_xl.jpg",
  "url": "https://www.lanecrawford.com.hk/product/sarah-zhuang/-click-link-diamond-18k-gold-earrings/_/BUL808/product.lc?utm_medium=embed&utm_source=ai-recommended&utm_campaign=2025-christmas_lc_ai-recommended",
  "color": "YELLOW GOLD",
  "groupName": "Fine Jewellery",
  "deptName": "Women's Fine Jewellery",
  "onlineBU": "Fine Jewellery",
  "stockAvailability": true
}
```

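A quick sanity check on the raw file can catch incomplete records before the expensive classification step. This is a rough sketch under two assumptions that the example above does not confirm: that the top level of `products-all.json` is a JSON array of such objects, and that the keys in `REQUIRED_KEYS` are always present. The function name `index_products` is hypothetical.

```python
import json

# Keys taken from the example record; treating them as required is an assumption.
REQUIRED_KEYS = {"id", "name", "brand", "category", "price", "currency", "imageUrl", "url"}


def index_products(raw_json: str) -> dict[str, dict]:
    """Parse the file contents and index the records by their "id" field.

    Assumes the top level is a JSON array of product objects.
    """
    products = json.loads(raw_json)
    indexed = {}
    for product in products:
        missing = REQUIRED_KEYS - product.keys()
        if missing:
            raise ValueError(f"{product.get('id', '<no id>')}: missing {sorted(missing)}")
        indexed[product["id"]] = product
    return indexed
```
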
## Example in `metadata_extraction.json`

```json
"EOJ367": {
  "subcategory": "necklaces",
  "gender": "female",
  "applicable_occasions": [
    "Casual",
    "Outdoor",
    "Travel / Transit"
  ],
  "inappropriate_occasions": [
    "Formal",
    "Black Tie / White Tie",
    "Bridal / Wedding",
    "Business / workwear",
    "Cocktail / Semi-Formal"
  ]
}
```

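Entries of this shape can be checked before ingestion. The sketch below is illustrative only: `check_entry` is a hypothetical helper, and the rule that an occasion should not appear in both lists is an assumption about the intended semantics, not something `process_item.py` is known to enforce.

```python
def check_entry(item_id: str, entry: dict) -> list[str]:
    """Return a list of problems found in one metadata_extraction.json entry."""
    problems = []
    # The four keys shown in the example entry above.
    for key in ("subcategory", "gender", "applicable_occasions", "inappropriate_occasions"):
        if key not in entry:
            problems.append(f"{item_id}: missing '{key}'")
    applicable = set(entry.get("applicable_occasions", []))
    inappropriate = set(entry.get("inappropriate_occasions", []))
    # Assumed rule: an occasion cannot be both applicable and inappropriate.
    overlap = applicable & inappropriate
    if overlap:
        problems.append(f"{item_id}: occasions in both lists: {sorted(overlap)}")
    return problems
```
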
## Metadata in Vector Database

```json
{
  "item_id": "EOJ128",
  "category": "accessories",
  "subcategory": "eyewear",
  "gender": "unisex",
  "modality": "image",
  "brand": "CELINE",
  "color": "BROWN",
  "description": "Immerse yourself in the depth of classic style with CELINE's Tortoiseshell Logo Sunglasses. Featuring a rich, tortoiseshell acetate frame and adorned with the iconic CELINE logo in gold, these sunglasses are a testament to timeless elegance and luxury. Perfect for those who appreciate a sophisticated aesthetic, they offer optimal UV protection while ensuring you remain at the forefront of fashion.",
  "tags": "celine,accessories,in-stock,new,maxi,triomphe,acetate,round",
  "price": 4500,
  "url": "https://www.lanecrawford.com.hk/product/celine/maxi-triomphe-acetate-round-sunglasses/_/EOJ128/product.lc?utm_medium=embed&utm_source=ai-recommended&utm_campaign=2025-christmas_lc_ai-recommended",
  "batch_source": "2025_q4",
  "Outdoor": 0,
  "Ski / Snow / Mountain": 0,
  "Festival / Concert": 0,
  "Activewear": 0,
  "Casual": 1,
  "Cocktail / Semi-Formal": -1,
  "Formal": -1,
  "Party / Clubbing": 0,
  "Evening": 0,
  "Travel / Transit": 0,
  "Beach / Swim": 0,
  "Garden Party / Daytime Event": 1,
  "Black Tie / White Tie": -1,
  "Resort": 1,
  "Athleisure": 0,
  "Business / workwear": -1,
  "Bridal / Wedding": -1
}
```
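
The flat record above looks like a merge of a raw product with its `metadata_extraction.json` entry. The following sketch shows one way such a record could be built. It rests on inferences from the examples, not on the actual `run_ingestion.py` code: applicable occasions are assumed to map to `1`, inappropriate ones to `-1`, and all others to `0`, and the tag list is assumed to be joined with commas. The names `occasion_flags` and `to_db_metadata` are hypothetical, and the `category` field is left out because its source is not shown in the examples.

```python
# Occasion names copied from the vector database example; assumed to be the full set.
OCCASIONS = [
    "Outdoor", "Ski / Snow / Mountain", "Festival / Concert", "Activewear",
    "Casual", "Cocktail / Semi-Formal", "Formal", "Party / Clubbing", "Evening",
    "Travel / Transit", "Beach / Swim", "Garden Party / Daytime Event",
    "Black Tie / White Tie", "Resort", "Athleisure", "Business / workwear",
    "Bridal / Wedding",
]


def occasion_flags(applicable: list[str], inappropriate: list[str]) -> dict[str, int]:
    """Encode each occasion as 1 (applicable), -1 (inappropriate), or 0 (neither)."""
    flags = {}
    for occasion in OCCASIONS:
        if occasion in applicable:
            flags[occasion] = 1
        elif occasion in inappropriate:
            flags[occasion] = -1
        else:
            flags[occasion] = 0
    return flags


def to_db_metadata(product: dict, extraction: dict, batch_source: str,
                   modality: str = "image") -> dict:
    """Merge a raw product and its extraction entry into one flat record."""
    record = {
        "item_id": product["id"],
        "subcategory": extraction["subcategory"],
        "gender": extraction["gender"],
        "modality": modality,
        "brand": product["brand"],
        "color": product.get("color", ""),
        "description": product.get("description", ""),
        "tags": ",".join(product.get("tags", [])),  # list -> comma-separated string
        "price": product["price"],
        "url": product["url"],
        "batch_source": batch_source,
    }
    record.update(occasion_flags(extraction["applicable_occasions"],
                                 extraction["inappropriate_occasions"]))
    return record
```

Keeping every value a string or a number, as in the example record, avoids nested lists in the stored metadata, which may be why the occasions are stored as one flag per key.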