Official AI Track modules, rules, marking criteria, penalties, and submission checklists.
Important: Any module with fewer than 10 registrations will be dropped.
To avoid confusion: AI Track work must be created live on event day at the venue using the organizer-approved platform and tools. Teams should bring laptops, chargers, and allowed notes, but pre-made final prompts, pre-built models, pre-written stories, or pre-completed reports are not accepted for scoring. Submissions are digital unless organizers explicitly announce a physical submission requirement for any round.
Prompt Engineering Competition · AI Track
Prompt Arena is a live prompt engineering battle where teams compete to produce accurate and creative outputs from a fixed AI model using prompt design only. No fine-tuning and no code changes are allowed. Teams receive identical base tasks and are judged on output quality and prompt craftsmanship.
This module tests role assignment, few-shot prompting, output constraints, reasoning strategy, and adversarial robustness.
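For illustration only, the first three techniques (role assignment, few-shot prompting, and output constraints) can be combined in one prompt roughly as sketched below. The helper `build_prompt`, its parameters, and the sample task are hypothetical and not part of the official tasks; teams still write their actual prompts live on event day.

```python
def build_prompt(role, examples, task, constraints):
    """Assemble a prompt from a role line, rules, few-shot examples, and the task.

    All names here are illustrative assumptions, not an official format.
    """
    lines = [f"You are {role}."]
    if constraints:
        lines.append("Follow these rules:")
        lines.extend(f"- {rule}" for rule in constraints)
    # Few-shot examples show the model the expected question/answer shape.
    for question, answer in examples:
        lines.append(f"Q: {question}")
        lines.append(f"A: {answer}")
    # The real task comes last, with an open answer slot for the model.
    lines.append(f"Q: {task}")
    lines.append("A:")
    return "\n".join(lines)


prompt = build_prompt(
    role="a careful fact extractor",
    examples=[("In which year did Apollo 11 land on the Moon?", "1969")],
    task="What is the chemical symbol for gold?",
    constraints=[
        "Answer with a single word or number.",
        "Reply 'unknown' if you are not sure.",
    ],
)
print(prompt)
```

Keeping the role, rules, and examples as separate inputs makes it easy to shorten one part at a time, which matters because prompt length is also scored.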
| Detail | Requirement |
|---|---|
| Team Size | 2 to 3 members |
| Eligibility | Open to CS, DS, and SE students |
| Team Lead | 1 member per team (primary communicator) |
| Track | Artificial Intelligence |
| Round | Focus | Challenge |
|---|---|---|
| Round 1 | Precision | Extract a specific factual answer with minimal hallucination |
| Round 2 | Creativity | Generate a culturally relevant creative piece under strict constraints |
| Round 3 | Adversarial | Write a prompt that makes the model resist a trick or trap embedded in the task description |
| Criteria | Marks | What Judges Look For |
|---|---|---|
| Output Quality and Accuracy | 25 | Whether output solves the task correctly |
| Prompt Ingenuity and Design | 20 | Structured, clever, and non-obvious prompt strategy |
| Adversarial Robustness (Round 3) | 20 | Ability to withstand trap instructions |
| Written Rationale | 15 | Clear and convincing explanation of prompt logic |
| Brevity and Efficiency | 10 | High quality with fewer prompt tokens |
| Consistency Across Rounds | 10 | Stable quality over precision, creativity, and adversarial tasks |
| Total | 100 | |
AI Fact-Checking and Red-Teaming Challenge · AI Track
Hallucination Hunt challenges teams to identify factual hallucinations in AI-generated documents, including fabricated citations, invented statistics, and subtle logic errors. Teams must find, classify, and correct each issue.
This module evaluates critical thinking, fact-checking discipline, and real-world AI red-teaming ability.
| Detail | Requirement |
|---|---|
| Team Size | 2 to 3 members |
| Eligibility | Open to CS, DS, SE, and related departments |
| Team Lead | 1 member per team |
| Track | Artificial Intelligence |
Each team receives the same document pack containing 3 domain documents (science, history, technology), each with 5 to 12 hidden hallucinations.
| Tier | Difficulty | Example |
|---|---|---|
| Tier 1 | Obvious | Clearly fabricated facts such as wrong dates or impossible numbers |
| Tier 2 | Subtle | Plausible-looking errors such as near-correct names and figures |
| Tier 3 | Adversarial | Logically consistent but factually false reasoning chains |
| Criteria | Marks | What Judges Look For |
|---|---|---|
| True Positives Found | 30 | Correctly detected hallucinations (scaled) |
| Correct Classification | 20 | Accurate tagging of error type |
| Quality of Corrections | 20 | Factually correct and concise fixes |
| False Positive Penalty | -5 | Deducted for incorrectly flagging valid statements |
| Confidence Score Calibration | 15 | Well-calibrated uncertainty estimates |
| Report Clarity and Structure | 15 | Professional, readable reporting format |
| Total | 100 | |
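The rules do not state how true positives are scaled or how calibration is measured. As a rough sketch under assumptions, the snippet below tallies a team's flags against an answer key and uses the Brier score as one common way to quantify calibration. The statement IDs, the helper names, and the choice of Brier score are all hypothetical, not the organizers' method.

```python
def tally_findings(flagged, answer_key):
    """Count true positives, false positives, and misses from flagged statement IDs."""
    flagged, answer_key = set(flagged), set(answer_key)
    true_positives = len(flagged & answer_key)
    false_positives = len(flagged - answer_key)
    missed = len(answer_key - flagged)
    return true_positives, false_positives, missed


def brier_score(confidences, outcomes):
    """Mean squared gap between stated confidence (0 to 1) and outcome (1 = flag was correct).

    Lower is better; 0.0 means perfectly confident and always right.
    """
    pairs = list(zip(confidences, outcomes))
    return sum((c - o) ** 2 for c, o in pairs) / len(pairs)


# Hypothetical data: statement IDs flagged by a team vs. an answer key.
flagged = ["S3", "S7", "S9", "S12"]
answer_key = ["S3", "S7", "S12", "S15", "S20"]
tp, fp, missed = tally_findings(flagged, answer_key)
print(tp, fp, missed)  # 3 1 2

# Confidence attached to each flag (same order as `flagged`) and whether it was correct.
confidences = [0.9, 0.8, 0.4, 0.7]
outcomes = [1, 1, 0, 1]
print(round(brier_score(confidences, outcomes), 3))  # 0.075
```

In this example the team was least confident about the one flag that turned out to be wrong, which is the behaviour a calibration criterion is meant to reward.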
Human-AI Collaborative Narrative Challenge · AI Track
AI Storyteller is a collaborative narrative challenge where teams co-author a short story with an AI language model. Teams write alternating human and AI paragraphs, directing the model toward a meaningful story rooted in KPK culture, folklore, or contemporary life.
The best score comes from strong storytelling, authentic cultural references, and clear human direction over AI output.
| Detail | Requirement |
|---|---|
| Team Size | 2 to 4 members |
| Eligibility | Open to all departments |
| Team Lead | 1 member (story director controlling prompt decisions) |
| Track | Artificial Intelligence |
Each team gets the same one-sentence KPK-based story seed and then alternates human and AI paragraphs until the story is complete.
| Requirement | Rule |
|---|---|
| Story Length | 10 to 14 paragraphs total (alternating Human / AI) |
| Human Paragraphs | Minimum 5 (clearly marked) |
| AI Paragraphs | Minimum 4 (raw output submitted alongside the edited version) |
| Cultural Anchor | At least 2 authentic KPK cultural elements |
| Live Pitch | 1-minute audience pitch at judging |
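Teams may find it useful to self-check a draft against the structural rules above before submitting. The sketch below is one possible way to do that; the function `check_story`, the `(author, text)` representation, and the placeholder element names are assumptions for illustration, and it does not judge whether a cultural element is authentic.

```python
def check_story(paragraphs, cultural_elements):
    """Return a list of rule violations for a story given as (author, text) pairs.

    `author` is expected to be "human" or "ai" (an assumed labelling convention).
    """
    problems = []
    authors = [author for author, _ in paragraphs]
    if not 10 <= len(paragraphs) <= 14:
        problems.append(f"story has {len(paragraphs)} paragraphs, expected 10 to 14")
    # Alternation fails whenever two neighbouring paragraphs share an author.
    if any(a == b for a, b in zip(authors, authors[1:])):
        problems.append("human and AI paragraphs do not alternate")
    if authors.count("human") < 5:
        problems.append("fewer than 5 human paragraphs")
    if authors.count("ai") < 4:
        problems.append("fewer than 4 AI paragraphs")
    if len(cultural_elements) < 2:
        problems.append("fewer than 2 cultural elements listed")
    return problems


# Hypothetical 10-paragraph draft that alternates human and AI authorship.
draft = [("human" if i % 2 == 0 else "ai", f"paragraph {i + 1}") for i in range(10)]
print(check_story(draft, ["element A", "element B"]))  # []
print(check_story(draft[:6], ["element A"]))
```

The second call reports four problems: the draft is too short, has too few human and AI paragraphs, and lists only one cultural element.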
| Criteria | Marks | What Judges Look For |
|---|---|---|
| Narrative Quality and Engagement | 25 | Compelling and coherent story flow |
| Cultural Authenticity | 20 | Genuine, respectful KPK references |
| Human Authorship and Direction | 20 | Meaningful human steering of AI output |
| AI Collaboration Quality | 15 | Prompt quality and seamless human-AI transitions |
| Audience Vote | 10 | Live audience's choice of most engaging story |
| Story Direction Note | 10 | Clear reflection on creative process and prompting decisions |
| Total | 100 | |
Live ML Model Building and Comparison Showdown · AI Track
Model Face-Off is a speed-build ML competition where teams receive the same tabular dataset on event day and must train, compare, and defend multiple models live. Teams are scored on both model performance and their explanation of why the winning model performed best.
The dataset may include missing values, class imbalance, and noisy features. Teams must handle these issues within the competition window.
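As a minimal sketch of how missing values and class imbalance might be handled together, the example below chains median imputation, scaling, and a class-weighted classifier in a scikit-learn pipeline. The synthetic data, the median strategy, and the choice of logistic regression are assumptions for illustration; the event dataset and the best treatment for it will differ.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the event dataset (assumption): 200 rows, 4 numeric features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
# Noisy label with a minority positive class.
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 1.0).astype(int)
# Knock out roughly 10% of the feature values to simulate missing data.
X[rng.random(X.shape) < 0.1] = np.nan

model = make_pipeline(
    SimpleImputer(strategy="median"),        # fill missing values per column
    StandardScaler(),                        # put features on a common scale
    LogisticRegression(class_weight="balanced", max_iter=1000),  # reweight the rare class
)
model.fit(X, y)
print(model.predict(X[:5]))
```

Putting the imputer and scaler inside the pipeline means they are fitted only on training data whenever the pipeline is cross-validated, which avoids leaking test-set statistics.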
| Detail | Requirement |
|---|---|
| Team Size | 3 to 4 members |
| Eligibility | Open to DS and CS students |
| Team Lead | 1 member per team |
| Track | Artificial Intelligence |
| Phase | Focus |
|---|---|
| Phase 1 - EDA | Dataset exploration and understanding (30 min) |
| Phase 2 - Build | Train at least 3 different ML models |
| Phase 3 - Compare | Create Model Scorecard for head-to-head evaluation |
| Phase 4 - Defend | Live 3-minute explanation and judges' Q&A |
Python is required. Allowed libraries include pandas, NumPy, scikit-learn, matplotlib, seaborn, SHAP, XGBoost, LightGBM, and imbalanced-learn. Deep learning frameworks are allowed but not favored. AutoML tools are strictly prohibited.
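The rules do not prescribe a format for the Model Scorecard. One possible shape, sketched below with scikit-learn only, is a table of cross-validated scores for three different model families. The synthetic dataset, the three chosen models, the macro F1 metric, and the use of 5-fold cross-validation are all assumptions; judges score on test-set metrics, so a held-out split would still be needed.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the event dataset (assumption).
X, y = make_classification(
    n_samples=300, n_features=8, weights=[0.8, 0.2], random_state=0
)

# Three different model families, as Phase 2 requires at least three models.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Scorecard: mean and spread of macro F1 over 5 folds for each model.
scorecard = {}
for name, estimator in candidates.items():
    scores = cross_val_score(estimator, X, y, cv=5, scoring="f1_macro")
    scorecard[name] = (scores.mean(), scores.std())

# Print best model first.
for name, (mean, std) in sorted(
    scorecard.items(), key=lambda item: item[1][0], reverse=True
):
    print(f"{name:<20} f1_macro = {mean:.3f} +/- {std:.3f}")
```

Macro F1 is used here instead of accuracy because accuracy can look high on imbalanced data even when the minority class is ignored.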
| Criteria | Marks | What Judges Look For |
|---|---|---|
| Model Performance (Best Model) | 25 | Best test-set metrics compared with other teams |
| Model Diversity and Comparison | 20 | Breadth and quality of Model Scorecard |
| Explainability Output | 20 | How clearly the team explains model behavior |
| Live Defence | 15 | Confidence, depth, and clarity in Q&A |
| Code Quality and EDA | 10 | Readable code and solid preprocessing |
| Handling Data Issues | 10 | Robust treatment of noise, missing values, and imbalance |
| Total | 100 | |