Algorithmic Transparency

Data & ML Methodology

We go beyond simple aggregation. Our platform runs a multi-stage pipeline of machine learning models to assess risk, forecast arrest trends, and translate complex legal case material for international readers.

Step 01

Automated Data Collection

Our system aggregates data from verified human rights sources, primarily the OVD-Info API and the Memorial Human Rights Center. Automated scripts synchronize our database with these sources on an ongoing basis so that records of arrests, sentencing, and prisoner locations stay current.
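
The synchronization step can be pictured as an idempotent upsert keyed on a case identifier. The sketch below is an illustration only: the table layout, the field names (case_id, status, location, updated_at), and the rule that a newer updated_at wins are all assumptions, and fetching from the upstream sources is deliberately left out.

```python
import sqlite3
from typing import Dict, Iterable

# Hypothetical schema; the real database layout is not described in this page.
SCHEMA = """
CREATE TABLE IF NOT EXISTS cases (
    case_id    TEXT PRIMARY KEY,
    status     TEXT,
    location   TEXT,
    updated_at TEXT  -- ISO 8601 timestamp, so string comparison orders correctly
)
"""


def sync_cases(conn: sqlite3.Connection, records: Iterable[Dict[str, str]]) -> int:
    """Insert new records and overwrite stale ones; return how many rows changed."""
    conn.execute(SCHEMA)
    changed = 0
    for rec in records:
        row = conn.execute(
            "SELECT updated_at FROM cases WHERE case_id = ?", (rec["case_id"],)
        ).fetchone()
        if row is not None and row[0] >= rec["updated_at"]:
            continue  # the stored copy is at least as recent, so skip it
        conn.execute(
            "INSERT OR REPLACE INTO cases (case_id, status, location, updated_at) "
            "VALUES (:case_id, :status, :location, :updated_at)",
            rec,
        )
        changed += 1
    conn.commit()
    return changed
```

Because the function skips records that are not newer than the stored copy, running it repeatedly on the same batch changes nothing, which is what makes a scheduled sync safe to retry.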

Step 02

Entity Extraction & Structuring

We use Natural Language Processing (NLP) to parse unstructured case summaries. This involves identifying legal actors (judges, investigators), categorizing the criminal articles charged (e.g., '207.3 Fake News'), and extracting the names of surveillance technology vendors linked to an arrest.
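
As a rough illustration of the article-categorization part of this step, the sketch below pulls article numbers out of a free-text summary with a regular expression. It is a rule-based stand-in, not the NLP model described above, and the pattern and the function name extract_articles are assumptions made for the example.

```python
import re
from typing import List

# Matches an article marker in English or Russian followed by a number
# such as 207.3. The list of markers is an assumption, not an exhaustive set.
ARTICLE_RE = re.compile(
    r"\b(?:Article|Art\.|ст\.|статья)\s*(\d{1,3}(?:\.\d{1,2})?)",
    re.IGNORECASE,
)


def extract_articles(summary: str) -> List[str]:
    """Return the distinct article numbers in order of first appearance."""
    found: List[str] = []
    for match in ARTICLE_RE.finditer(summary):
        number = match.group(1)
        if number not in found:
            found.append(number)
    return found
```

A pattern like this is easy to audit but brittle. Identifying judges, investigators, or vendor names generally needs a trained named-entity model instead.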

Step 03

Geocoding & Neural Translation

Location data is standardized and converted to coordinates via the OpenStreetMap Nominatim API for geospatial analysis. Concurrently, case details are translated from Russian to English using Google Cloud Translation services to ensure international accessibility.
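
A minimal sketch of the geocoding half of this step, assuming the public Nominatim search endpoint and its JSON output, where lat and lon arrive as strings. The helper names and the user_agent parameter are hypothetical, and the translation call is not shown.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional, Tuple

NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def build_search_url(place: str) -> str:
    """Build a search URL that asks for at most one JSON result."""
    query = urllib.parse.urlencode({"q": place, "format": "jsonv2", "limit": 1})
    return f"{NOMINATIM_SEARCH}?{query}"


def parse_first_hit(payload: str) -> Optional[Tuple[float, float]]:
    """Return (lat, lon) from a Nominatim JSON response, or None if it is empty."""
    results = json.loads(payload)
    if not results:
        return None
    return float(results[0]["lat"]), float(results[0]["lon"])


def geocode(place: str, user_agent: str) -> Optional[Tuple[float, float]]:
    """Look up one place name. Performs a network request."""
    request = urllib.request.Request(
        build_search_url(place), headers={"User-Agent": user_agent}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_first_hit(response.read().decode("utf-8"))
```

The public Nominatim instance expects an identifying User-Agent and no more than one request per second, so a batch job would normally cache results and throttle its calls.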

Step 04

Machine Learning Risk Assessment

We use XGBoost classifiers trained on historical data to assign risk probabilities to new cases. The models evaluate factors such as criminal articles, age, gender, and location to estimate two probabilities: 'Urgency' (Immediate Action Required) and 'Risk of Torture' while in custody.
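
Before a gradient-boosted classifier such as XGBoost can score a case, the factors listed above have to be turned into a numeric feature vector. The sketch below shows one common way to do that with one-hot and multi-hot encoding. It does not train or call a model, and the field names (age, articles, region, gender) and the helper names are assumptions chosen for the example.

```python
from typing import Any, Dict, List

Case = Dict[str, Any]


def fit_vocab(cases: List[Case]) -> Dict[str, List[str]]:
    """Collect the sorted set of categories seen in the training cases."""
    return {
        "articles": sorted({a for c in cases for a in c["articles"]}),
        "regions": sorted({c["region"] for c in cases}),
        "genders": sorted({c["gender"] for c in cases}),
    }


def encode(case: Case, vocab: Dict[str, List[str]]) -> List[float]:
    """Encode one case as [age, article flags..., region flags..., gender flags...]."""
    row = [float(case["age"])]
    # Multi-hot: a case can be charged under several articles at once.
    row += [1.0 if a in case["articles"] else 0.0 for a in vocab["articles"]]
    # One-hot: exactly one region and one gender value per case.
    row += [1.0 if case["region"] == r else 0.0 for r in vocab["regions"]]
    row += [1.0 if case["gender"] == g else 0.0 for g in vocab["genders"]]
    return row
```

Fitting the vocabulary on training data only and reusing it at prediction time keeps the column order stable. A category unseen in training simply encodes as all zeros.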

Step 05

Predictive Forecasting & Network Topology

Our Python microservice runs Prophet time-series models to forecast arrest trends up to 90 days ahead. In addition, we build network graphs that link cases by similarity (shared charges, location, and tactics) to detect communities of related cases that may indicate coordinated repression campaigns.
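
The similarity graph can be sketched in a few lines. The example below links two cases when they share at least two of three attributes and then groups them with connected components, which is a simple stand-in for a real community detection algorithm. The field names, the min_shared threshold, and the scoring rule are assumptions, and the Prophet forecast is not shown.

```python
from collections import defaultdict, deque
from itertools import combinations
from typing import Any, Dict, Iterable, List, Set, Tuple

Case = Dict[str, Any]


def shared_attributes(a: Case, b: Case) -> int:
    """Count how many of charges, region, and tactic two cases have in common."""
    score = 0
    if set(a["charges"]) & set(b["charges"]):
        score += 1
    if a["region"] == b["region"]:
        score += 1
    if a["tactic"] == b["tactic"]:
        score += 1
    return score


def build_edges(cases: Dict[str, Case], min_shared: int = 2) -> List[Tuple[str, str]]:
    """Link every pair of cases that shares at least min_shared attributes."""
    return [
        (x, y)
        for x, y in combinations(sorted(cases), 2)
        if shared_attributes(cases[x], cases[y]) >= min_shared
    ]


def connected_components(
    nodes: Iterable[str], edges: List[Tuple[str, str]]
) -> List[List[str]]:
    """Group nodes reachable from one another, using breadth-first search."""
    adjacency: Dict[str, Set[str]] = defaultdict(set)
    for x, y in edges:
        adjacency[x].add(y)
        adjacency[y].add(x)
    seen: Set[str] = set()
    groups: List[List[str]] = []
    for start in sorted(nodes):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        group: List[str] = []
        while queue:
            node = queue.popleft()
            group.append(node)
            for neighbor in adjacency[node] - seen:
                seen.add(neighbor)
                queue.append(neighbor)
        groups.append(sorted(group))
    return groups
```

Note that two cases can land in the same group without being directly similar, as long as a chain of similar cases connects them. That is useful for spotting campaigns that shift tactics over time.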

Step 06

Generative Legal Tools

Using Large Language Models (LLMs), we provide generative tools for legal professionals. These include an Affidavit Generator that synthesizes prisoner data with country condition reports to draft support documents for asylum cases, and an automated system for generating 'Statement of Complicity' dossiers.
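
One plausible way to combine prisoner data with country condition reports is to assemble both into a single grounded prompt before any model is called. The sketch below shows only that assembly step. The required fields, the prompt wording, and the function name build_affidavit_prompt are assumptions, and no LLM API is invoked.

```python
from typing import Dict, List

# Hypothetical minimum set of fields; the real tool's inputs are not documented here.
REQUIRED_FIELDS = ("name", "charges", "detention_location")


def build_affidavit_prompt(prisoner: Dict[str, str], country_excerpts: List[str]) -> str:
    """Combine case facts and numbered source excerpts into one prompt string."""
    missing = [field for field in REQUIRED_FIELDS if not prisoner.get(field)]
    if missing:
        raise ValueError(f"missing prisoner fields: {', '.join(missing)}")
    if not country_excerpts:
        raise ValueError("at least one country condition excerpt is required")
    # Number each excerpt so the drafted text can cite it unambiguously.
    sources = "\n".join(
        f"[{i}] {text}" for i, text in enumerate(country_excerpts, start=1)
    )
    return (
        "Draft a supporting affidavit for an asylum case.\n"
        "Use only the facts below and cite sources by their bracketed number.\n"
        "Label the output as a draft for attorney review.\n\n"
        f"Name: {prisoner['name']}\n"
        f"Charges: {prisoner['charges']}\n"
        f"Detention location: {prisoner['detention_location']}\n\n"
        f"Country condition excerpts:\n{sources}\n"
    )
```

Refusing to build a prompt when fields or sources are missing keeps the model from filling gaps with invented facts, and numbered sources make the draft easier for a lawyer to verify.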

A Note on Predictive Models

Our risk scores and forecasts are probabilistic tools derived from historical data. They are designed to aid researchers and legal professionals in prioritization, not to replace human judgment. A "High Risk" score indicates a statistical resemblance to past cases involving torture or harsh sentencing, but specific outcomes may vary.
