| Project ID | Project Title | PI and Researchers | Repository | Description |
|---|---|---|---|---|
| #23S-07C | A Comparative Evaluation of Deep Learning Models for Speech Enhancement in Real-world Noisy Environments | | GitHub Code (Public) | The "DL_SpeechEnhancementToolkit" is a collection of deep learning models (including Wave-U-Net, CMGAN, and UNet) for enhancing speech quality under noisy real-world conditions. It provides end-to-end code for data preprocessing, model training, and evaluation (including SNR, PESQ, and speaker-recognition metrics). |
| #22S-01M | On the Capacity and Uniqueness of Synthetic Face Images | | GitHub Code (Public) | Code to reproduce all the results of the publication resulting from this project. |
| #22-01J-SP | Fully Homomorphic Encryption in Biometrics | | GitHub Code (Public) | Code to reproduce the results of the paper published as part of this project. |
| #24S-04W | Deep-Learning-Based Generation of Synthetic Contactless Fingerphotos | | GitHub Code (Private) | This project explores the synthetic generation of contactless fingerphotos using two deep learning approaches: image translation and sample synthesis. The repository contains Python-based implementations of CycleGAN and a Diffusion model, each with instructions for separate execution. CycleGAN was trained on a paired dataset of contact and contactless fingerprint images, demonstrating image translation capabilities. The Diffusion model focuses on synthetic generation, trained on fingerphotos collected with modern mobile devices. |
| #23S-05W | Performance Evaluation of Cross-Spectral Iris Matching: Visible vs NIR | | GitHub Code (Public) | The goal of this work is to develop a method for iris translation between the NIR and VIS spectra using Generative Adversarial Networks (GANs). Image enhancement techniques are applied in the cropped and normalized iris domains, and a classifier component is incorporated into the GAN to preserve identity information during image translation. The system simultaneously synthesizes translations in both directions (NIR→VIS and VIS→NIR) during training. |
| #23S-08CB | Performance Benchmark: Ear-only vs. Ear+Face Fusion Biometrics for Adults and Children | | GitHub Code (Private) | This repository implements a custom-trained YOLOv5 pipeline for detecting and cropping ear regions from profile face images. Using 100 manually annotated samples from the Clarkson Child Dataset, the YOLOv5 model was fine-tuned to accurately localize ears in side-view facial imagery. The ear_detection.py script loads the trained model.pt and processes each input image to generate high-quality cropped ear images for downstream analysis. It supports batch inference, configurable paths, and confidence-based bounding box selection. The tool is designed for biometric research, medical preprocessing, or any task requiring precise ear localization. Full setup instructions and dependencies are included for easy integration and reproducibility. |
| #23S-08CB | Performance Benchmark: Ear-only vs. Ear+Face Fusion Biometrics for Adults and Children | | GitHub Code (Public) | This repository provides a complete pipeline for ear normalization and flattening based on landmark detection and geometric alignment. Starting with ear crops detected using a custom YOLOv5 model, the pipeline localizes key landmarks using a pretrained TensorFlow .pb model (Hansley et al., 2018). It then applies geometric transformations to normalize ear orientation and shape across samples. Finally, PCA-based flattening is performed to generate 2D representations suitable for biometric analysis. The pipeline supports both single-image (demo.py) and batch processing (main.py) modes, making it adaptable for research and deployment. This implementation closely follows methods described in the literature for ear alignment and template generation. |
| #24S | Biometric Aging in Children — Phase V | | GitHub Code (Private) | This GitHub repository provides a MATLAB-based pipeline for age-adaptive fingerprint scaling. A regression script fits polynomial models to Preciozzi’s published scale factors (ages 1, 2, 5–9 yrs) and adulthood, then predicts missing values for ages 3, 4, and 10–17 based on the best-fitting model. A rescaling function applies each subject’s age-specific scale factor via bicubic interpolation to normalize ridge spacing to an adult reference (~9 px @ 500 ppi). Finally, a batch script reads enrollment ages from Excel, processes all images in a directory tree, and outputs scaled prints in a parallel folder structure—ensuring compatibility with commercial fingerprint matchers. |
| #23F-02W | Improving Performance of Facial Recognition via Multi-Model Algorithmic Ensemble | | GitHub Code (Public) | This repository contains code for an ensemble knowledge distillation project, where knowledge from multiple teacher models is distilled into a single student network to improve the overall performance of facial recognition systems. |
| #24S-01W | A Facial Image Quality Toolbox Based on ISO/IEC 29794-5 Specification | | GitHub Code (Public) | This repository contains code for training a model that produces a unified score to assess face image quality, integrating multiple quality factors into a single metric. The implementation aligns with the ISO/IEC 29794-5 specification for biometric image quality evaluation and supports extensibility for research and benchmarking across different datasets. |
| #24S-04B | Investigating Molecular Fingerprints of Human Tissue using Multi-spectral Photoacoustic Imaging | | No code for this project. | This project involves a preliminary investigation into the use of multi-spectral photoacoustic imaging to identify molecular fingerprints of human tissue. The study explores feasibility and imaging methodology, but no software or code artifacts are currently associated with this project. |
| #21F-01M | diffDeMorph: Extending Reference-Free Demorphing to Unseen Faces | | GitHub Code (Public) | diffDeMorph introduces a diffusion-based reference-free demorphing approach that separates individual faces from composite morphs without requiring reference images. It generalizes across diverse morphing techniques and face styles, outperforming prior methods by over 59% under a unified training setup. Trained on synthetic morphs and evaluated on real datasets, it demonstrates strong generalization and visual fidelity across six benchmarks. |
| #21F-01M | Facial Demorphing from a Single Morph Using a Latent Conditional GAN | | GitHub Code (Public) | This work presents a latent conditional GAN-based framework for reference-free facial demorphing from a single morph image. The method addresses morph replication and generalization issues by decomposing morphs in latent space, enabling recovery of constituent faces created using unseen morphing techniques and real images. Trained on synthetic morphs and tested on real data, it significantly outperforms existing methods in visual fidelity and robustness. Recipient of the IJCB 2025 IAPR Best Student Paper Award. |
| #23F-01J-SP | Towards the Creation of a Large Dataset of High-Quality Face Morphs - Phase III | | GitHub Code (Public) | |
| #22F-01C | Android App for Biometric Visible Light Iris Image Capture | | GitHub Code (Public) | This Android application enables biometric iris image capture in the visible light spectrum using advanced on-device processing and quality control mechanisms. It integrates real-time MediaPipe face mesh detection for precise eye localization and leverages the Camera2 API for accurate autofocus and region-of-interest targeting. Built-in OpenCV routines perform image quality analysis to ensure dataset-grade consistency, while compliance with ISO/IEC 29794-6 standards guarantees high-quality iris captures. The app also supports flashlight control for uniform eye illumination, automated file naming for streamlined data management, and structured dataset generation. |
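The age-adaptive fingerprint scaling pipeline described above for project #24S (polynomial regression over published scale factors, then bicubic rescaling per subject) can be sketched in a few lines. The sketch below is a minimal Python analogue of the MATLAB pipeline; the scale-factor values, the polynomial degree, and the use of OpenCV in place of MATLAB's imresize are all illustrative assumptions, not values from the repository:

```python
import numpy as np

# Placeholder (age -> scale) pairs standing in for Preciozzi's published
# scale factors (ages 1, 2, 5-9, and adulthood). The real values are in
# the paper and are NOT reproduced here; these numbers are illustrative.
known_ages = np.array([1, 2, 5, 6, 7, 8, 9, 25], dtype=float)
known_scales = np.array([1.85, 1.60, 1.30, 1.25, 1.21, 1.17, 1.14, 1.00])

# Fit a low-degree polynomial to the known pairs, mirroring the
# regression script (the degree is an assumption for this sketch).
coeffs = np.polyfit(known_ages, known_scales, deg=3)
predict_scale = np.poly1d(coeffs)

# Predict the scale factors the literature lacks: ages 3, 4, and 10-17.
missing_ages = np.array([3, 4] + list(range(10, 18)), dtype=float)
predicted_scales = predict_scale(missing_ages)

def rescale_fingerprint(img, scale):
    """Resize a fingerprint by its age-specific factor using bicubic
    interpolation, normalizing ridge spacing toward the adult reference
    (~9 px @ 500 ppi). OpenCV stands in for MATLAB's imresize here."""
    import cv2  # assumed available; not part of the original toolchain
    h, w = img.shape[:2]
    return cv2.resize(img, (round(w * scale), round(h * scale)),
                      interpolation=cv2.INTER_CUBIC)
```

In this scheme, a child's print is enlarged toward adult ridge spacing before being passed to a commercial matcher, so the matcher never sees out-of-distribution ridge frequencies.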
| Dataset Title | PI and Researchers | Repository | Description |
|---|---|---|---|
| Development of Extended-Length Audio Dataset for Advanced Deepfake Synthesis and Detection | | GitHub Page | Participant count: 36 speakers (each ~45 minutes of read-aloud natural speech in controlled sessions). Recording setup: five high-quality microphones per session (parallel channels). Synthetic speech: matched synthetic voices for 20 subjects using a mix of open-source and commercial systems (e.g., Tortoise TTS, ElevenLabs), enabling paired natural ↔ synthetic comparisons. Format: WAV audio files (original sample rates preserved). Metadata & privacy: demographics and capture-condition metadata provided; personally identifying information is anonymized. Access & hosting: public release via IEEE DataPort (Standard Dataset; access typically requires an IEEE DataPort subscription). |
| Clarkson University: Multi-modal Longitudinal Children Biometric Dataset | | GitHub Page | This longitudinal biometric dataset was collected from six different modalities at approximately six-month intervals over a span of 9.5 years at Potsdam Elementary and Potsdam High School. The collection, funded by the Center for Identification Technology Research (CITeR) and the National Science Foundation, includes: Fingerprints, Footprints, Face, Ear, NIR Iris, and Voice Recordings. The dataset aims to support research on biometric aging in children and enhance biometric recognition technologies for younger populations. |
| Clarkson University: Multimodal Longitudinal Infant–Toddler Biometric Dataset | | GitHub Page | This longitudinal biometric dataset was collected from six different modalities at approximately 15-day to 3-month intervals over a span of two years at the Canton Pediatric Office and Clarkson University. The collection, funded by the Center for Identification Technology Research (CITeR) and the National Science Foundation, includes: Fingerprints, Footprints, Face, Ear, NIR Iris, and Voice Recordings. The dataset supports research in early-age biometric development and recognition performance in infants and toddlers. |
| IBM-UB Online and Offline Multilingual Handwriting Dataset | | GitHub Page | The IBM-UB dataset is a bilingual and bi-modal handwriting corpus containing both online (stylus-captured) and offline (scanned) handwritten documents in multiple languages. It includes free-form text, structured forms, isolated words, and symbols. This dataset supports multilingual OCR and information retrieval research. |
| UB RidgeBase Fingerprint Dataset | | RidgeBase Dataset Page | RidgeBase is a large-scale benchmark dataset for advancing research in contactless fingerprint recognition using smartphone cameras. It contains over 15,000 contactless and contact-based fingerprint image pairs collected from 88 individuals under diverse conditions. The dataset supports evaluation of multiple matching scenarios. |
| GestSpoof: Gesture-Based Spatio-Temporal Representation Learning for Robust Fingerprint Presentation Attack Detection | | Dataset Page | The GestSpoof dataset contains both real and spoof fingerprint samples collected from approximately 23 subjects using three spoof materials (bodydouble_alja, ecoflex_alja, and gelatin_bodydouble). Fingers include index, middle, ring, and little, each captured under five motion gestures (Vertical, Horizontal, Diag1, Diag2, and Twist). This dataset enables spatio-temporal modeling for motion-based fingerprint presentation attack detection by capturing differences in elastic deformation between authentic and synthetic fingerprints. |
| Multimodal Biometric Dataset Collection | | GitHub Page | The Multimodal Biometric Dataset Collection is a multisite database developed by West Virginia University and Clarkson University for research under the CITeR Database Release Agreement. It includes iris, face, face video, voice, fingerprint, hand geometry, and palmprint data for over 500 subjects collected across multiple visits. The dataset supports multimodal biometric fusion and longitudinal analysis, and includes releases: WVU Release 1, WVU Release 2, and CU Release 2. |
| Quality—Face/Iris Research Ensemble (Q-FIRE) | | GitHub Page | The Q-FIRE dataset, developed at Clarkson University, contains iris and face image sequences collected at varying distances and quality levels. It was funded by the Department of Homeland Security (DHS) Science and Technology Directorate in cooperation with the National Science Foundation (NSF). The dataset supports research on multi-biometric fusion and image quality assessment, and was designed for use in IREX II evaluations. |
| Clarkson University: Face–Iris Research Ensemble (Q-FIRE II) | | GitHub Page | The Face–Iris Research Ensemble (Q-FIRE II) dataset, collected at Clarkson University, contains visible, NIR, and LWIR image sequences of faces and irises at varying distances and qualities, along with voice recordings. Funded by the Center for Identification Technology Research (CITeR) and the National Science Foundation, this multimodal dataset supports research in multi-spectral and multi-biometric fusion, quality assessment, and human identification under diverse conditions. |
| Keystroke Dataset Collection | | GitHub Page | The Keystroke Dataset was collected at Clarkson University from 39 participants across two laboratory sessions, each containing approximately 10K keystrokes. The dataset includes password entries, free text responses, and transcription tasks, with synchronized video recordings of participants’ facial expressions and hand movements. It supports research in behavioral biometrics and keystroke dynamics under controlled experimental settings. |
| Facial Makeup Datasets | | GitHub Page | The Facial Makeup Datasets comprise four collections designed to study the effects of cosmetic makeup on face recognition performance: YMU (YouTube Makeup), VMU (Virtual Makeup), MIW (Makeup in the Wild), and MIFS (Makeup Induced Face Spoofing). These datasets include both real and synthetically modified female face images sourced from online media and controlled repositories, supporting research in makeup-robust face recognition and presentation attack detection. |
| University at Buffalo Dataset for Keystroke Dynamics and Mouse Movements | | GitHub Page | The dataset from University at Buffalo includes keystroke dynamics and mouse movement data collected under transcription and free-text typing conditions. It also captures variations from different keyboard types across sessions. Mouse coordinate and event data complement the typing data to support research in behavioral biometrics and user interaction analysis. More details are available at buffalo.edu/cubs. |
| WVU: Iris Biometric In Difficult Conditions Dataset (IBIDC) | | GitHub Page | The Iris Biometric In Difficult Conditions (IBIDC) dataset, developed by West Virginia University, contains off-axis and angled iris images captured using two cameras — a Sony Cyber Shot DSC-F717 and a monochrome infrared camera. The Sony camera operated in infrared “night vision” mode but retained RGB sensor data, giving the images a green hue. The dataset supports research in iris recognition under challenging capture conditions. |
| Michigan State University: Mobile Face Spoofing Dataset (MFSD) | | Dataset Page | The Mobile Face Spoofing Dataset (MFSD) was developed at Michigan State University to simulate spoof attacks on smartphones using cameras that replicate input received by face-unlock systems like Android’s Trusted Face. It contains 280 video clips of photo and video attack attempts targeting 35 clients. The dataset supports research in presentation attack detection and mobile face authentication security. |
| Michigan State University: Tattoo Sketch and Image Dataset | | GitHub Page | The Tattoo Sketch and Image Dataset, developed at Michigan State University, supports research in tattoo recognition and cross-domain matching between hand-drawn sketches and photographic tattoo images. It enables evaluation of sketch-based retrieval and forensic tattoo identification techniques. More details are available at biometrics.cse.msu.edu. |
| Clarkson University: Nail-to-Nail Prototype Finger Photo Dataset | | GitHub Page | The Nail-to-Nail (N2N) Prototype Finger Photo Dataset, developed at Clarkson University and funded by CITeR, includes 15 photographs acquired around the finger to support nail-to-nail fingerprint analysis. It was released through NIST Special Dataset 302 as part of the N2N Fingerprint Challenge. Researchers can request access directly from NIST. Additional details are available in the GitHub repository. |
| Clarkson University: Liveness Detection Competition Fingerprint and Iris Datasets (LivDet) | | GitHub Page | The Liveness Detection Competition (LivDet) datasets, developed at Clarkson University and funded by CITeR, include fingerprint and iris images collected across multiple years (2009, 2011, 2013, 2015, 2017) featuring both live and spoof samples made from materials such as gelatin, Play-Doh, Ecoflex, paper, and patterned contact lenses. These datasets support research in presentation attack detection and biometric liveness verification. Additional information is available at livdet.org, and related repositories for newer releases (2020, 2021, 2024) can be found in the same GitHub organization. |
| WVU: Synthetic Iris Dataset Collection | | GitHub Page | The Synthetic Iris Dataset Collection, developed at West Virginia University, contains both texture-based and model-based synthetic iris images. The texture-based set includes 1,000 classes with seven samples per class, while the model-based set includes 5,000 subjects with left and right eyes, each having 16 images. These datasets support research on iris synthesis, biometric simulation, and template generation. More details are available at biic.wvu.edu. |
| WVU: Multispectral Ocular Biometrics Collection | | GitHub Page | The Multispectral Ocular Biometrics Collection, developed at West Virginia University, supports research in ocular recognition using multispectral imaging. The dataset captures NIR and RGB images of the eye to analyze sclera texture, vasculature patterns, and periocular cues. It enables studies on enhancing non-frontal iris recognition and integrating multiple spectral channels for improved biometric performance. |
| Clarkson University: Experimental Quality Latent Fingerprint Dataset | | GitHub Page | The Experimental Quality Latent Fingerprint Dataset, developed at Clarkson University in collaboration with SUNY-Canton, contains latent fingerprints captured using multiple devices, lighting conditions, surfaces, and placements. The dataset includes direct surface images, lift images, and lift scans. It was funded by the DHS Science and Technology Directorate and the National Science Foundation, and is available to CITeR members under the Database Release Agreement. |
| Clarkson University: Liveness Detection Competition Iris Dataset 2020 (LivDet) | | GitHub Page | The Liveness Detection Competition Iris Dataset 2020 (LivDet), developed at Clarkson University, contains iris images from live and spoof samples created using materials such as paper and patterned contact lenses. This dataset supports research in iris presentation attack detection and biometric liveness verification. It is available to CITeR members under the Database Release Agreement. |
| Clarkson University: Liveness Detection NonContact Fingerprint Dataset | | GitHub Page | The Liveness Detection NonContact Fingerprint Dataset, developed at Clarkson University, contains 4-finger photo images from live and spoof samples created using ecoflex, gelatin, and Play-Doh. This dataset supports research in noncontact fingerprint presentation attack detection and spoof-resilient biometric systems. |
| Clarkson University: Liveness Detection Competition 2021 Face (LivDet) | | GitHub Page | The Liveness Detection Competition 2021 Face (LivDet) dataset, developed at Clarkson University, contains live and spoof face images captured using DSLR, iPhone X, Samsung Galaxy S9, Google Pixel, and Basler sensors. It includes nine types of presentation attack instruments (PAIs) such as printed paper displays, laptop screen replays, 2D photo masks, 3D printed masks (low, medium, high quality), wearable silicone masks, and video display attacks. The dataset supports research in face presentation attack detection under diverse capture conditions. |
| Clarkson University: Liveness Detection Competition 2024 Face (LivDet) | | GitHub Page | The Liveness Detection Competition 2024 Face (LivDet) dataset, developed at Clarkson University, contains live and spoof face images representing nine presentation attack instrument (PAI) categories. It includes bobblehead models, projection attacks (2D and 3D), half-cloth masks, print and replay attacks, silicon masks, and 3D masks of varying quality. Collected using nine sensors under diverse lighting and device conditions, this dataset supports research on robust face presentation attack detection and cross-sensor generalization. |
| WVU: Contactless Fingerprint Collection I | | GitHub Page | The Contactless Fingerprint Collection I dataset from West Virginia University includes data from 216 participants, featuring hand photos captured with a Canon 5DSR, contactless fingerprint scans from Gemalto/Thales and Morpho Wave devices, contact-based prints from a Crossmatch Guardian sensor, and mobile captures using an iPhone X. The dataset supports research on cross-sensor interoperability and contactless fingerprint biometrics. |
| WVU: Contactless Fingerprint Collection II | | GitHub Page | The Contactless Fingerprint Collection II dataset from West Virginia University includes data from 500 participants captured using multiple contactless and contact-based systems. It features Guardian and Kojak contact prints, Morpho Wave and Gemalto contactless sensors, DSLR hand geometry photos, and smartphone captures (Samsung S20/S21) under both controlled and operational conditions across multiple sessions. This dataset enables research in cross-sensor evaluation and contactless fingerprint interoperability. |
| WVU: ARDEC Multimodal Using Fieldable Devices | | GitHub Page | The ARDEC Multimodal Using Fieldable Devices dataset from West Virginia University includes data from 300 participants collected using both standard biometric sensors and portable multi-biometric devices. It features iris images from iCam TD 100, five-pose face photos (Canon 5DSR), and slap / rolled fingerprints (Crossmatch Guardian). Additional data were acquired using Northrop Grumman’s multi-biometric platform (Galaxy S5 + Kojak) and the Crossmatch SEEK Avenger, capturing faces, irises, and fingerprints under operational, uncontrolled conditions. This dataset supports research in portable biometric systems and cross-platform evaluation. |
| LibriVOC Dataset | | GitHub Page | The LibriVOC Dataset is a large-scale, open-source corpus developed for vocoder artifact detection. Derived from the LibriTTS and LibriSpeech datasets, LibriVOC provides benchmark audio samples sourced from LibriVox audiobooks to support research in neural vocoder identification and speech synthesis forensics. |