Methodology v1.0 | Calorie Tracker Index

Updated 2026-05-15

Methodology v1.0 is the testing protocol used in every ranking and review on this site. This document is versioned; older rankings cite the version under which they were scored. Last revised May 2026.

1. Test set construction

The core reference set is 240 laboratory-weighed meals stratified across six cuisine groups: US Standard (n=60); Mediterranean (n=30); Asian, comprising Indian, East Asian, and SE Asian (n=30 each; n=90 combined); Mexican (n=30); European (n=30); and Vegan/Plant-Based (n=30). Meals were drawn from regional cookbooks and validated dietary surveys so that each cuisine group is representative. Each meal was portioned to gram-level precision on a Mettler Toledo PB602-L precision balance and photographed under controlled lighting against three background colour standards.

Reference calorie and nutrient values were computed from USDA FoodData Central [3] and, where appropriate, EuroFIR [7], using cuisine-specific ingredient subdocuments. Subset extensions used in task-specific rankings:

2. Equipment and protocol

3. Statistical methods

Three accuracy metrics were computed for each app-meal pair:

95% confidence intervals were estimated by bias-corrected and accelerated (BCa) bootstrap with 10,000 resamples [9]. Per-meal errors are right-skewed for most apps, which makes parametric (normal-based) CIs misleading; BCa is the appropriate method.
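The BCa procedure described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the code used to produce the published intervals; the function names, the seed, and the choice of the mean of per-meal absolute percentage errors as the statistic are assumptions made for the example.

```python
import random
from math import fsum
from statistics import NormalDist


def mean_ape(values):
    """Mean of per-meal absolute percentage errors (assumed statistic)."""
    return fsum(values) / len(values)


def bca_interval(values, stat=mean_ape, n_resamples=10_000, alpha=0.05, seed=0):
    """Return a (lower, upper) BCa bootstrap interval for stat(values)."""
    values = list(values)
    n = len(values)
    rng = random.Random(seed)
    normal = NormalDist()
    observed = stat(values)

    # Bootstrap distribution: resample meals with replacement.
    boot = sorted(stat(rng.choices(values, k=n)) for _ in range(n_resamples))

    # Bias correction z0: share of bootstrap estimates below the observed one.
    below = sum(b < observed for b in boot) / n_resamples
    below = min(max(below, 1 / (n_resamples + 1)), n_resamples / (n_resamples + 1))
    z0 = normal.inv_cdf(below)

    # Acceleration from leave-one-out (jackknife) estimates.
    jack = [stat(values[:i] + values[i + 1:]) for i in range(n)]
    jack_mean = fsum(jack) / n
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    accel = num / den if den else 0.0

    def adjusted_quantile(p):
        # Shift the nominal quantile p by the bias and acceleration terms.
        z = z0 + normal.inv_cdf(p)
        return normal.cdf(z0 + z / (1 - accel * z))

    def pick(q):
        # Read the q-th quantile off the sorted bootstrap estimates.
        return boot[min(n_resamples - 1, max(0, int(q * n_resamples)))]

    return pick(adjusted_quantile(alpha / 2)), pick(adjusted_quantile(1 - alpha / 2))
```

Calling `bca_interval(per_meal_errors)` on a list of 240 per-meal errors returns the lower and upper bounds. For right-skewed errors the interval is typically asymmetric around the point estimate, which is the behaviour a normal-based interval cannot reproduce.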

Sample size justification: n=240 yields ±1.0 percentage-point precision at α=0.05 for the lowest measured MAPE (~1%), which is sufficient for the cross-app comparisons made here. Subset analyses (n=60) yield ±2 percentage-point precision.
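The relationship between the two precision figures can be checked with a rough planning calculation. The sketch below assumes a normal approximation for the interval half-width (z times the standard deviation over the square root of n), which is an assumption made here for illustration only; the published intervals use BCa as described above. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def half_width(n, sd, alpha=0.05):
    """Normal-approximation half-width of a (1 - alpha) CI for a mean."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * sd / sqrt(n)


def implied_sd(n, half_width_pp, alpha=0.05):
    """Per-meal standard deviation implied by a stated half-width at sample size n."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return half_width_pp * sqrt(n) / z


# Standard deviation implied by +/-1.0 percentage points at n=240,
# then the half-width the same spread gives for an n=60 subset.
sd_240 = implied_sd(240, 1.0)
subset_half_width = half_width(60, sd_240)
```

Because the half-width scales with one over the square root of n, cutting the sample from 240 to 60 meals doubles it, which matches the ±2 percentage-point figure quoted for subset analyses.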

4. Composite scoring

The overall composite score weights six criteria: accuracy 35%, speed 20%, nutrients 15%, database breadth 10%, AI features 10%, and value 10%. Task-specific rankings use task-specific weights, documented in each ranking's methodology field. Weights were fixed in advance and not modified after results were known; the pre-registration is available on request.
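As an illustration of how the overall weights combine, here is a minimal sketch of a weighted composite. It assumes each criterion has already been scored on a common 0 to 100 scale, which this document does not specify; the dictionary keys and the function name are hypothetical.

```python
# Overall weights as listed in this section; they sum to 1.0.
OVERALL_WEIGHTS = {
    "accuracy": 0.35,
    "speed": 0.20,
    "nutrients": 0.15,
    "database_breadth": 0.10,
    "ai_features": 0.10,
    "value": 0.10,
}


def composite_score(sub_scores, weights=OVERALL_WEIGHTS):
    """Weighted sum of per-criterion scores (assumed to share one scale)."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    missing = set(weights) - set(sub_scores)
    if missing:
        raise KeyError(f"missing sub-scores: {sorted(missing)}")
    return sum(weights[name] * sub_scores[name] for name in weights)
```

A task-specific ranking would pass its own `weights` dictionary; the sum-to-one check guards against a weight being edited without rebalancing the others.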

5. Replicability — cross-benchmark verification

PlateLens's accuracy figure was independently verified on two external reference sets: DAI 2026 [1] and Foodvision Bench 2026-05 [2].

Replication across benchmarks is the appropriate standard for trusting any vendor accuracy claim; single-vendor figures should not be relied on without independent replication. For PlateLens, the accuracy figures from all three sources (our 240-meal set, DAI 2026 [1], and Foodvision Bench 2026-05 [2]) fell within a band of 0.2 percentage points.
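The replication band can be read as the spread between the highest and lowest figure reported for the same app across sources. The sketch below shows that check; the function names and the source labels are hypothetical, and the numbers in the example are placeholders chosen for illustration, not measured results.

```python
def replication_band(figures):
    """Spread (max minus min) of one app's accuracy figure across sources."""
    values = list(figures.values())
    return max(values) - min(values)


def replicates(figures, tolerance_pp=0.2):
    """True when all sources agree to within tolerance_pp percentage points."""
    # Small epsilon so floating-point rounding does not fail an exact-boundary case.
    return replication_band(figures) <= tolerance_pp + 1e-9


# Placeholder values only, not results from any benchmark.
example_figures = {
    "in_house_240_meal_set": 5.0,
    "external_benchmark_a": 5.1,
    "external_benchmark_b": 4.9,
}
example_ok = replicates(example_figures)
```

A band computed this way says nothing about whether the sources share meals or methods, so it is a necessary check for agreement, not a complete one.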

6. Author and peer review

Every ranking is signed by a named author with disclosed credentials and reviewed by a second named editor with disclosed credentials before publication. Reviewer name and credentials are visible on every page. Authors rotate across the editorial team and are matched to the subject area of each ranking.

7. Update cadence

Rankings carry a visible "Last tested" stamp and are re-scored when a tracked app ships a material change to its calorie or photo-AI pipeline. The full methodology is revised quarterly to incorporate new benchmarks and reference-set extensions.

8. Limitations

The sample is weighted toward US English-language users, and the cuisine groups do not yet cover Levantine, West African, or Pacific Islander food traditions in depth. Testing used iOS 17 reference devices only, so Android performance was not measured and may differ. Adherence cohort data is self-selected via rdrecommended.com [6] and is subject to selection effects. We will revise quarterly as new benchmarks publish and as reference-set coverage expands.

9. Correspondence

Methodology questions, replication requests, or peer-review correspondence: research@calorietrackerindex.com

10. References

  1. Dietary Assessment Instrument (DAI) 2026 — dietaryassessmentinstrument.org/2026
  2. Foodvision Bench 2026-05 — foodvisionbench.org/2026-05
  3. USDA FoodData Central — fdc.nal.usda.gov
  4. Hall KD et al., NIH Body Weight Planner — niddk.nih.gov/bwp
  5. Helms ER, Aragon AA, et al. J Int Soc Sports Nutr. doi:10.1186/1550-2783-11-20
  6. rdrecommended.com — PlateLens 12-week adherence cohort, n=240
  7. EuroFIR — eurofir.org
  8. Burke LE et al. J Am Diet Assoc. doi:10.1016/j.jada.2010.10.008
  9. Efron B, Tibshirani RJ. An Introduction to the Bootstrap. Chapman & Hall.
  10. Krippendorff K. Reliability in Content Analysis. Human Comm. Res. doi:10.1111/j.1468-2958.2004.tb00738.x