TechMesh Data Science Rule Book

Important: Any module with fewer than 10 registrations will be dropped.

Important Participation Summary

To avoid confusion: all Data Science Track work must be completed live on event day at the venue. Teams should bring laptops, chargers, and allowed datasets or notes as per module rules, but pre-built final solutions, pre-written code notebooks, or pre-generated outputs are not accepted for scoring. Submissions are digital unless organizers explicitly announce any physical submission requirement.

MODULE 1: DATA DOPPELGANGER

Algorithm Competition | Data Science Track

What Is This Module?

Data Doppelganger is a live coding competition where teams build a personality-matching algorithm entirely on the event day. Each team is responsible for sourcing their own personality dataset beforehand and must arrive on the event day ready to use it. Teams develop an algorithm that finds data twins: two people whose personality profiles most closely match each other.

The algorithm must take personality inputs like tea vs. coffee preference, morning vs. night person, introvert vs. extrovert, and similar attributes, then output a Personality Card showing the two most similar individuals and why they match.
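For illustration only, the matching idea can be sketched with a simple agreement score: the fraction of attributes on which two people give the same answer. This is one possible baseline, not a required or recommended method. The names, attribute keys, and sample profiles below are hypothetical.

```python
from itertools import combinations

# Hypothetical sample profiles; names, keys, and values are illustrative only.
profiles = {
    "Ayesha": {"drink": "tea", "chronotype": "night", "social": "introvert"},
    "Bilal": {"drink": "coffee", "chronotype": "morning", "social": "extrovert"},
    "Sana": {"drink": "tea", "chronotype": "night", "social": "extrovert"},
}


def similarity(a, b):
    """Return the fraction of shared attributes on which two profiles agree."""
    keys = a.keys() & b.keys()
    if not keys:
        return 0.0
    return sum(a[k] == b[k] for k in keys) / len(keys)


def find_data_twins(people):
    """Return (name_1, name_2, score) for the most similar pair of people."""
    best = max(
        combinations(sorted(people), 2),
        key=lambda pair: similarity(people[pair[0]], people[pair[1]]),
    )
    return best[0], best[1], similarity(people[best[0]], people[best[1]])


twin_a, twin_b, score = find_data_twins(profiles)
print(f"{twin_a} and {twin_b} match on {score:.0%} of attributes")
```

Comparing every pair takes time proportional to the square of the number of entries, which is manageable for a dataset of a few hundred people.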

This is a one-day module, held on Day [X] of the 3-Day Challenge.

Team Composition

Detail	Requirement
Team Size	3 to 4 members
Team Lead	1 member per team (handles communication with organizers)
Department	Data Science / Computer Science

Event Day Schedule

Time	Activity
Opening	Competition officially begins, teams set up their workspace
Coding Window	Teams build their algorithm (time limit: as set by organizers)
Submission	Code and output submitted to judges before deadline
Evaluation	Judges review submissions and score against criteria
Results	Winners announced at end of day

Dataset Guidelines

Each team is responsible for finding and bringing their own personality dataset. The dataset must be based on personality-related attributes such as:

Tea or coffee preference
Morning person or night person
Introvert or extrovert
Preferred study environment (silent or with noise)
Risk-taking tendency
Decision-making style (logical vs. emotional)
Social energy level
Creative vs. analytical thinking style

The dataset must contain a minimum of 100 entries and cover at least 8 distinct personality attributes. Teams are encouraged to explore public sources like Kaggle MBTI datasets and Open Psychometrics raw data. Teams may also collect their own survey data if minimum requirements are met. Fabricated or AI-generated datasets are strictly prohibited and result in disqualification.
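A quick self-check against the stated minimums might look like the sketch below. It assumes the dataset has been loaded as a list of dictionaries (for example with `csv.DictReader`) and that an attribute only counts if it is filled in for every entry. That second condition is an assumption of this sketch, not an organizer rule, and the function name `check_dataset` is hypothetical.

```python
def check_dataset(rows, min_entries=100, min_attributes=8):
    """Return a list of problems; an empty list means both minimums are met."""
    problems = []
    if len(rows) < min_entries:
        problems.append(f"only {len(rows)} entries, need {min_entries}")
    # Keep only attributes that are present and non-empty in every entry
    # (an assumption of this sketch, stricter than the written rule).
    complete = set(rows[0]) if rows else set()
    for row in rows:
        complete &= {key for key, value in row.items() if value not in (None, "")}
    if len(complete) < min_attributes:
        problems.append(
            f"only {len(complete)} complete attributes, need {min_attributes}"
        )
    return problems


# Placeholder rows used only to exercise the check; not a real dataset.
sample = [{f"q{i}": "yes" for i in range(8)} for _ in range(100)]
print(check_dataset(sample))  # []
```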

What Teams Must Build

A working algorithm that processes personality data and identifies the two most similar individuals
A Personality Card output showing the matched pair, top shared traits, and a similarity score or percentage
Clean, readable, well-commented code
A brief text explanation in code comments or a short README describing the approach
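As a rough illustration of the second deliverable, a text-only Personality Card could be assembled as sketched below. The layout, the function name `personality_card`, and the choice of three shared traits are assumptions of this sketch; teams are free to design a richer or graphical card.

```python
def personality_card(name_a, name_b, profile_a, profile_b, top_n=3):
    """Build a plain-text Personality Card for a matched pair."""
    common = [key for key in profile_a if key in profile_b]
    shared = [key for key in common if profile_a[key] == profile_b[key]]
    score = len(shared) / len(common) if common else 0.0
    lines = [
        "=== PERSONALITY CARD ===",
        f"Data twins: {name_a} & {name_b}",
        f"Similarity: {score:.0%}",
        "Top shared traits:",
    ]
    lines += [f"  - {key}: {profile_a[key]}" for key in shared[:top_n]]
    return "\n".join(lines)


# Hypothetical profiles used only to show the card layout.
card = personality_card(
    "Person A",
    "Person B",
    {"drink": "tea", "chronotype": "night", "social": "introvert", "thinking": "creative"},
    {"drink": "tea", "chronotype": "night", "social": "extrovert", "thinking": "creative"},
)
print(card)
```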

Rules and Regulations

General Rules

Python is the required programming language. Standard libraries such as pandas, NumPy, scikit-learn, and matplotlib are permitted.
Use of pre-trained large language models (for example, ChatGPT or Gemini) as the core matching engine is strictly prohibited. The algorithm must be the team's own logic.
Internet access is allowed only for referencing documentation and libraries, not for copying algorithmic solutions.
Teams may use AI tools for assistance, but copying complete code or ideas directly from them will result in 0 marks.
All team members must be present throughout the event day. Absence of more than one member may result in score deductions.
Teams are not permitted to share code, datasets, or approaches with other teams during the competition.
Any form of plagiarism, including copying from online repositories, results in immediate disqualification.

Submission Rules

Teams must submit code and Personality Card output before the organizer deadline.
Late submissions are penalized as outlined in the penalties section.
Submitted code must be original work produced on the event day.

Accuracy Testing - Live Audience Evaluation

Judges will select 3 to 4 volunteers from the audience. Each volunteer fills in the personality form on the spot. The team runs the algorithm live on those responses, and a Personality Card is displayed showing whether a data twin exists and how strong the match is.

Judges score real-time performance on unseen data. A strong match with a clear, visual explanation scores higher. The algorithm must run smoothly on these inputs; teams will have no prior knowledge of volunteer responses.
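One way to handle a live volunteer, sketched under assumptions: compare the new response against every dataset entry and report the closest one, flagging it as a data twin only if the agreement passes a cut-off. The 0.75 threshold and the function name `best_match` are hypothetical choices, not values set by the organizers.

```python
def best_match(volunteer, dataset, threshold=0.75):
    """Return (index, score, is_twin) for the entry closest to the volunteer.

    Assumes dataset is a non-empty list of dictionaries.
    """

    def agreement(entry):
        keys = volunteer.keys() & entry.keys()
        if not keys:
            return 0.0
        return sum(volunteer[k] == entry[k] for k in keys) / len(keys)

    scores = [agreement(entry) for entry in dataset]
    index = max(range(len(scores)), key=scores.__getitem__)
    return index, scores[index], scores[index] >= threshold


# Hypothetical two-entry dataset and volunteer response.
dataset = [
    {"drink": "tea", "chronotype": "night"},
    {"drink": "coffee", "chronotype": "morning"},
]
index, score, is_twin = best_match({"drink": "coffee", "chronotype": "morning"}, dataset)
print(index, score, is_twin)
```

Reporting "no twin found" when the best score falls below the threshold is a design choice that keeps weak matches from being presented as strong ones.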

Marks Criteria

Criteria	Marks	What Judges Look For
Uniqueness and Novelty of Approach	20	Original matching logic; creative hybrid methods rewarded
Live Accuracy on Audience Data	20	How well the algorithm performs on real volunteer inputs selected by the judge
Code Quality and Structure	15	Clean, readable, well-commented code with logical flow and no redundancy
Visual Design of Personality Card	15	How informative, clear, and visually appealing the output card is
Creativity of Output	10	Engagement factor of the card and whether it tells a story about the match
Clarity of Approach (Comments/README)	10	How well the team explains logic and methodology in writing
Overall Presentation of Output	10	Whether final output feels polished and complete
Total	100

Penalties

Violation	Penalty
Late submission	-5 marks per 15 minutes of delay
Use of a pre-trained AI model as the core engine	Immediate disqualification
Pre-written algorithm brought to the event	Immediate disqualification
Code copied from online repositories or other teams	Immediate disqualification
Fabricated or AI-generated dataset	Immediate disqualification
Absence of more than one team member	-10 marks
Failure to submit a Personality Card as output	-15 marks
Unsportsmanlike conduct or interference with other teams	-10 marks and formal warning
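The mark deductions in the table can be worked through with a short calculation. The sketch below assumes that a partly used 15-minute block counts as a full block and that a score cannot drop below zero; neither point is stated in the rules, so organizers may apply them differently. The function names are hypothetical.

```python
import math


def late_penalty(minutes_late, marks_per_block=5, block_minutes=15):
    """Marks deducted for a late submission, returned as a positive number."""
    if minutes_late <= 0:
        return 0
    # Assumption: a partly used 15-minute block counts as a full block.
    return math.ceil(minutes_late / block_minutes) * marks_per_block


def final_score(raw_marks, minutes_late=0, other_deductions=0):
    """Apply deductions to raw marks; flooring at zero is an assumption."""
    return max(0, raw_marks - late_penalty(minutes_late) - other_deductions)


# Example: 82 raw marks, 20 minutes late, one -10 deduction for absence.
print(final_score(82, minutes_late=20, other_deductions=10))  # 62
```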

Submission Checklist

Complete source code (Python file or notebook)
Personality Card output (screenshot, printed card, or displayed output)
Brief written explanation of the algorithm (in comments or a short README)

MODULE 2: CITY WHISPERER

Algorithm Competition | Data Science Track

What Is This Module?

City Whisperer is a live coding competition inspired by this idea: can an algorithm guess where someone is from based on the way they answer a few questions? Teams build a city prediction model that takes personality and lifestyle responses and predicts the city in Khyber Pakhtunkhwa they most likely belong to.

On event day, judges select a random student from the audience. The student answers a short set of live questions. The team's algorithm predicts the student's home city, for example Peshawar, Swat, D.I. Khan, Mardan, Abbottabad, or another city in KPK. A closer prediction earns a better score.

This is a one-day module, held on Day [X] of the 3-Day Challenge.

Team Composition

Detail	Requirement
Team Size	3 to 4 members
Eligibility	Open to DS and CS students
Team Lead	1 member per team (handles communication with organizers)
Department	Data Science / Computer Science

Event Day Schedule

Time	Activity
Opening	Competition officially begins, teams set up workspace
Coding Window	Teams build algorithm (time limit set by organizers)
Submission	Code and output submitted to judges before deadline
Evaluation	Judges review submissions and score against criteria
Results	Winners announced at end of day

Dataset Guidelines

Each team must source a dataset linking lifestyle, cultural, and behavioral attributes to cities within Khyber Pakhtunkhwa. Suggested attributes include:

Language or dialect preference (Pashto, Hindko, Saraiki, etc.)
Food preferences (chapli kabab, sajji, namkeen, etc.)
Typical daily routine and sleep patterns
Preferred mode of transport
Urban vs. rural lifestyle leaning
Climate preference (cold hills vs. hot plains)
Festivals and cultural events familiarity
Social gathering and hospitality norms

The dataset must cover at least 15 distinct cities within KPK and contain at least 100 entries. Teams may gather their own survey data or use public sources (for example Kaggle and data.gov.pk). Fabricated or AI-generated datasets are strictly prohibited and lead to disqualification.
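A possible self-check for the city coverage requirement is sketched below. It assumes each entry is a dictionary with a `city` field (a hypothetical column name) and treats spellings that differ only in case or surrounding spaces as the same city. It also lists the rarest cities, since classes with very few entries are hard to predict.

```python
from collections import Counter


def summarize_cities(rows, city_key="city", min_cities=15, min_entries=100):
    """Summarize city coverage and check it against the stated minimums."""
    names = [row[city_key].strip().lower() for row in rows if row.get(city_key)]
    counts = Counter(names)
    return {
        "entries": len(rows),
        "distinct_cities": len(counts),
        "meets_minimums": len(rows) >= min_entries and len(counts) >= min_cities,
        # The three cities with the fewest entries.
        "rarest": sorted(counts.items(), key=lambda item: item[1])[:3],
    }


# Placeholder rows used only to exercise the check; not a real dataset.
rows = [{"city": f"City {i % 15}"} for i in range(100)]
summary = summarize_cities(rows)
print(summary["distinct_cities"], summary["meets_minimums"])
```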

What Teams Must Build

A working classification algorithm that predicts a person's most likely home city from lifestyle and cultural responses
A City Card output with predicted city, confidence/probability score, and top reasons behind prediction
A short input form or prompt list with at least 8 questions
Clean, readable, well-commented code
A brief explanation in comments or README about approach and model choice
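To make the first two deliverables concrete, here is a minimal nearest-neighbour sketch in plain Python. It votes among the k training rows that agree most with the volunteer's answers, reports the vote share as a confidence score, and lists the answers that supported the prediction. This is one simple baseline among many; the function name, the value of k, and the toy rows are all hypothetical, and the toy rows are invented placeholders rather than real survey data.

```python
from collections import Counter


def predict_city(answers, training_rows, k=5, city_key="city"):
    """Predict a city by majority vote among the k most similar training rows."""

    def agreement(row):
        return sum(row.get(question) == answer for question, answer in answers.items())

    nearest = sorted(training_rows, key=agreement, reverse=True)[:k]
    votes = Counter(row[city_key] for row in nearest)
    city, count = votes.most_common(1)[0]
    confidence = count / len(nearest)
    # Reasons: answers shared with the neighbours that voted for the winner.
    reasons = Counter(
        question
        for row in nearest
        if row[city_key] == city
        for question, answer in answers.items()
        if row.get(question) == answer
    )
    return city, confidence, [question for question, _ in reasons.most_common(3)]


# Invented toy rows for illustration only.
training = [
    {"language": "Pashto", "climate": "hot", "city": "Peshawar"},
    {"language": "Pashto", "climate": "hot", "city": "Peshawar"},
    {"language": "Hindko", "climate": "cold", "city": "Abbottabad"},
    {"language": "Hindko", "climate": "cold", "city": "Abbottabad"},
    {"language": "Pashto", "climate": "cold", "city": "Swat"},
]
city, confidence, reasons = predict_city(
    {"language": "Hindko", "climate": "cold"}, training, k=3
)
print(f"Predicted city: {city} ({confidence:.0%} confidence), based on {reasons}")
```

The permitted libraries offer ready-made classifiers, for example in scikit-learn, that could replace the hand-written vote while still exposing probabilities for the City Card.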

Accuracy Testing - Live Audience Evaluation

Judges will select 3 to 4 volunteers at random. Each volunteer answers the team's input questions live. The algorithm then outputs a City Card with the predicted city and a confidence score. Each volunteer confirms whether the prediction is correct, and judges evaluate live accuracy and confidence on this unseen audience data.
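Teams that want to track their own live results could compute a simple hit rate, as sketched below. How judges convert live results into marks is not specified here, so this is only an assumed bookkeeping aid; the function name `live_accuracy` is hypothetical.

```python
def live_accuracy(results):
    """Fraction of correct predictions.

    results is a list of (predicted_city, actual_city) pairs, one per volunteer.
    """
    if not results:
        return 0.0
    correct = sum(
        predicted.strip().lower() == actual.strip().lower()
        for predicted, actual in results
    )
    return correct / len(results)


# Hypothetical outcomes for four volunteers.
outcomes = [
    ("Mardan", "Mardan"),
    ("Swat", "Peshawar"),
    ("Abbottabad", "abbottabad "),
    ("Peshawar", "Peshawar"),
]
print(f"Live accuracy: {live_accuracy(outcomes):.0%}")
```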

Rules and Regulations

General Rules

Python is required. Standard libraries like pandas, NumPy, scikit-learn, and matplotlib are permitted.
Pre-trained large language models cannot be used as the core prediction engine.
Internet access is allowed for documentation and libraries only, not for copying algorithmic solutions.
Teams may use AI tools for assistance, but copying complete code or ideas directly from them results in 0 marks.
All team members must remain present throughout event day; absence of more than one member may reduce score.
Teams may not share code, datasets, or approaches with other teams during competition.
Any plagiarism, including copying from online repositories, results in immediate disqualification.

Submission Rules

Teams must submit code and City Card output before organizer deadline.
Late submissions are penalized per penalty policy.
Submitted code must be original work produced on event day.

Marks Criteria

Criteria	Marks	What Judges Look For
Uniqueness and Novelty of Approach	20	Original classification logic and creative feature engineering/model choices
Live Accuracy on Audience Data	20	How correctly the model predicts city for real volunteers
Code Quality and Structure	15	Clean, readable, commented, and logically organized code
Visual Design of City Card	15	Clear, informative, and visually appealing prediction output
Creativity of Output	10	How engaging and compelling the explanation of prediction is
Clarity of Approach (Comments/README)	10	How well logic, features, and model choice are explained
Overall Presentation of Output	10	Whether final output feels polished and complete
Total	100

Penalties

Violation	Penalty
Late submission	-5 marks per 15 minutes of delay
Use of a pre-trained AI model as core engine	Immediate disqualification
Pre-written algorithm brought to event	Immediate disqualification
Code copied from repositories or other teams	Immediate disqualification
Fabricated or AI-generated dataset	Immediate disqualification
Absence of more than one team member	-10 marks
Failure to submit a City Card as output	-15 marks
Unsportsmanlike conduct or interference with other teams	-10 marks and formal warning

Submission Checklist

Complete source code (Python file or notebook)
City Card output (screenshot, printed card, or displayed output)
Input question form used for audience volunteers
Brief written explanation of algorithm (comments or short README)

Module designed by: Head of Data Science | Computer Cell Society, UET Peshawar

COMPUTER CELL SOCIETY
University of Engineering and Technology, Peshawar
TechMesh 2026 - Official Rule Book

Table of Contents

Important Participation Summary

MODULE 1: DATA DOPPELGANGER

What Is This Module?

Team Composition

Event Day Schedule

Dataset Guidelines

What Teams Must Build

Rules and Regulations

Accuracy Testing - Live Audience Evaluation

Marks Criteria

Penalties

Submission Checklist

MODULE 2: CITY WHISPERER

What Is This Module?

Team Composition

Event Day Schedule

Dataset Guidelines

What Teams Must Build

Accuracy Testing - Live Audience Evaluation

Rules and Regulations

Marks Criteria

Penalties

Submission Checklist