Hassan Saif

© Copyright 2024 HSAIF. All Rights Reserved.

Dr Hassan Saif

I'm a Senior Data Science Manager with over 15 years’ experience in Machine Learning (ML), NLP, Software Engineering, and Big Data. I am proficient in leading teams to develop and implement scalable ML models and cloud-based solutions that address a wide range of societal, political, and business challenges. My expertise covers financial crime detection, counterterrorism, online radicalisation, sentiment analysis, and user behaviour analysis. I have strong experience with stakeholder management, guiding the entire lifecycle of ML-based projects, and delivering actionable intelligence and insights to front-line businesses and governmental agencies. I am the author of a recognised book on NLP and Sentiment Analysis and a contributor to over 30 publications in world-leading journals and conferences.

Lead Data Scientist & Software Engineer

As a tech-savvy innovator, I understand new technologies and how they can be used in various fields. I know Python, R, TensorFlow, and other data science and software engineering tools. This technical expertise helps me align technology with business goals as a strategist.

  • Birthday: 15 August
  • Website: www.hsaif.net
  • Phone: Upon Request
  • City: Milton Keynes
  • Age: 38
  • Degree: PhD
  • Email: hsaif@hsaif.net
  • Freelancing: Not Available

I support data-driven decision making, ethical artificial intelligence, and the responsible application of technology. Looking ahead, my objective is to lead endeavours that not only bring commercial success but also contribute positively to society. Whether advising a new venture or supervising a team, my primary aim is always to deliver practical, evidence-based conclusions.


These are the significant milestones that have marked my professional trajectory. Leading groundbreaking projects, contributing to the scientific literature, fostering the development of future technology leaders, and receiving international recognition all underscore my commitment to excellence and individual growth.

Projects: Pioneering solutions across industry and academia

Publications: Covering a diverse range of societal, political, and business topics

Trainees: Professionals coached in diverse realms, from tech to emotional intelligence

International Awards: Global recognition for innovation, research, and mentoring


A diverse range of skills is necessary for stimulating innovation and promoting development. This portfolio highlights my skill set in facilitating achievements in the realms of personal and technical growth.

Machine Learning: 100%
Natural Language Processing: 99%
Software Engineering: 95%
Big Data: 93%
Project Management: 92%
Quantitative & Qualitative Analytics: 95%
Cloud Computing & Microservices: 90%
Python: 98%
Research & Publications: 92%
Team Leadership & Management: 91%


By integrating deep academic knowledge with hands-on industry experience, I have built a specialism in developing groundbreaking solutions using Machine Learning, Software Engineering, and Big Data technologies. During my career, I have been involved in a variety of significant initiatives, including efforts to prevent financial misconduct, detect online radicalisation, and understand the complexities of online user behaviour. My affiliation with esteemed institutions such as HSBC and the European Research Council (ERC) is evidence of my dedication and expertise.


Hassan Saif

Lead Data Scientist with over 15 years' experience in Machine Learning, Computational Social Science, Software Engineering, and Big Data

  • Milton Keynes, UK
  • hsaif@hsaif.net


PhD in Computational Data Science & NLP

2011 - 2015

The Open University, Knowledge Media Institute, UK

A study that leveraged Semantic Web techniques together with Machine Learning and NLP to enhance sentiment analysis performance on social media. I designed several methods to enrich both supervised and unsupervised Machine Learning models with the conceptual and contextual semantics of words. Additionally, I developed SentiCircles, a semantic vector representation of words that uses trigonometry and Euclidean geometry to better assign sentiment to words based on their contextual and conceptual meanings. The scientific community has made extensive use of SentiCircles to improve both traditional and deep learning models for sentiment analysis.

The doctoral thesis was honoured with the prestigious International Semantic Web Distinguished PhD Thesis award (Kobe, Japan, 2016) and was later published as a book in 2017.

B.Sc. in Computer Science & Artificial Intelligence

2003 - 2008

Damascus University, Damascus, Syria

Studied key courses including Machine Learning, Deep Neural Networks, Programming Languages, Game Theory, Applied Mathematics, and Software Engineering.

Technical Skills

Big Data
  • Strong experience with managing and processing large production databases using on-premise and cloud-based data solutions, including the Hadoop ecosystem (YARN, HDFS, MapReduce, Hive, Avro) and Google Cloud Datastores.
  • Experience with building and managing ETL and data flow pipelines using DBT.
  • Good knowledge of RDBMSs such as MySQL and SQL Server.
  • Familiarity with cloud services, particularly Amazon Web Services and Google Cloud Platform, and associated analytics and ML capabilities (TFX, GKE, Kubeflow).
  • Experience with building event-driven microservices architectures using Apache Kafka.
Analytics & Machine Learning
  • Strong experience in quantitative and qualitative analytics and ML (pattern recognition, classification, clustering, dimensionality reduction, feature engineering, and model evaluation).
  • Strong experience in Natural Language Processing, Text Mining, Semantic Vector Representations, Topic Modelling, and User Behaviour Analysis.
  • Familiarity with Semantic Web modelling, toolkits, and ontologies, including Semantic Graphs, RDF, OWL, DBpedia, etc.
  • Experience in ML and Deep Learning frameworks and packages, including scikit-learn, TensorFlow, Keras, TFLearn, Jupyter, Pandas, NumPy, etc.
  • Data visualization with Matplotlib, Pyplot, Seaborn, and ggplot2.
Software Design and Engineering
  • Strong experience in Python, and previous knowledge of R, Java, C++, C#, PHP, and JavaScript.
  • Good knowledge of functional and object-oriented programming.
  • Software design, development, testing, deployment, and architecture design.
  • Experience with CI/CD and version control tools, including Git, Gerrit, and Travis.
  • Good knowledge of web development packages and frameworks such as Dash and the MEAN stack (MongoDB, Express, Angular, and Node.js).

International Awards

  • International SWSA Distinguished PhD Dissertation Award, “Semantic Sentiment Analysis of Microblogs”, ISWC, Kobe, Japan, Oct 2016.
  • Best Research Paper Award (Nominee), “Mining Pro-ISIS Radicalisation Signals from Social Media Users”, ICWSM, Cologne, Germany, 2016.
  • Best Journal Paper of the Year (Honourable Mention), “Contextual semantics for sentiment analysis of Twitter”, Information Processing and Management, 2015.
  • Best Research Paper Award, “Adapting sentiment lexicons using contextual semantics for sentiment analysis of Twitter”, 1st Workshop on Semantic Sentiment Analysis, Crete, Greece, 2014.
  • Best Research Paper Award, “Alleviating Data Sparsity for Twitter Sentiment Analysis”, 2nd Workshop on Making Sense of Microposts, Lyon, France, 2012.

Selected Keynotes & Invited Talks

  • Extracting Policing-related Evidence from Social Media Data: Northumbria Police Centre, Newcastle upon Tyne, UK, May 2017.
  • Radicalisation Detection on Social Media: Social Media and Policing Conference, Milton Keynes, UK, February 2017.
  • Semantic Sentiment Analysis in Social Streams: International Semantic Web Conference. Kobe, Japan, October 2016.

Selected Publications

  • “Semantic Sentiment Analysis in Social Streams”, Studies on the Semantic Web, IOS Press, June 2017.
  • “Mining Pro-ISIS Radicalisation Signals from Social Media Users”, Journal of Web Semantics, 2016.
  • “Contextual Semantics for Sentiment Analysis of Twitter”, Information Processing and Management, 2015.
  • “Semantic Sentiment Analysis of Twitter”, International Semantic Web Conference, 2012.

Professional Experience

Senior Data Science Manager

2022 - Present

HSBC Group, Financial Crime Digital Enablement, Chief Compliance Office, London, UK

I head a multinational team of data scientists, software engineers, and system architects developing cutting-edge, scalable AI-powered solutions within HSBC’s Financial Crime Digital Enablement function. My role involves leveraging advanced technologies such as NLP, GenAI, and Large Language Models to optimise and automate complex manual processes such as model and policy documentation and financial crime investigations. I manage stakeholder relationships across various areas, including financial crime, model risk, and name screening, ensuring effective communication and collaboration and presenting complex technical developments in AI and ML to stakeholders clearly. Key responsibilities include:

  • Leading the design and development of AI-driven solutions to streamline HSBC's manual documentation and review processes for ML models and policies.
  • Implementing Agile methodologies for product and team management, ensuring prompt delivery and adaptability to evolving requirements.
  • Facilitating collaboration among IT, Analytics, and Software Engineering functions to ensure that the team’s deliverables are in line with the organisation’s strategic initiatives in financial crime prevention and risk management.
  • Overseeing the complete project management lifecycle for our team’s cloud-based ML solutions, from gathering requirements to deployment, while effectively handling stakeholder and client interactions to ensure project objectives meet business needs and expectations.

Senior Data Scientist

2018 - 2022

HSBC Group, Research & Development, Chief Compliance Office, London, UK

Led the design and development of innovative, large-scale, and production-ready anti-financial crime systems. This was an interdisciplinary role at the intersection of quantitative analysis/ML, software development, and pipeline infrastructure engineering. I was also responsible for managing multiple cross-jurisdiction analytics teams, as well as preparing and delivering a broad range of technical briefings for senior HSBC executives. Key responsibilities and achievements included:

  • Managing the design, development, and delivery of HSBC's global anti-financial crime ML-based ecosystems.
  • Working with extremely large, complex data sets (30Bn+ records), generating insights, and identifying and modelling historical and newly emerging financial crime risk behaviours and trends.
  • Building an end-to-end NLP-powered ML pipeline for identifying confirmed financial crime risk typologies within a global population of HSBC's known bad actors. The developed pipeline helped deliver auto-generated gold-standard datasets for model training and evaluation.
  • Building ML-based anti-money laundering models that are ~5 times better and detect risk ~0.7 months faster than existing systems.
  • Working jointly with IT and engineering teams on the integration and deployment of large-scale ML pipelines.
  • Leading model performance analysis and evaluating business metrics to generate actionable insights and provide compelling visualisations for both business and technical stakeholders to support evidence-based decision-making processes within the bank.
  • Managing key stakeholder relationships to determine and translate business needs into research requirements and development workflows.

Lead Data Scientist

2017 - 2018

True212, Engineering Dept, London, UK

Led the design and development of project RER (Real-Time Editorial Resource), a recommender system that provides journalists and editors with real-time insights, recommendations, and monitoring services on social media. Main responsibilities included:

  • Designing scalable ML and NLP pipelines (NER, WSD, Topic Modelling, and Semantic Vector Representations) for identifying relevant insights, trends, and breaking news from social media.
  • Using Big Data technologies and event streaming frameworks (Hadoop, Spark, Kafka) along with cloud computing services (AWS, Atlas) to design and build an event-driven and fault-tolerant system architecture for the RER's ML framework.
  • Managing the RER software development lifecycle across internal and overseas teams.

ML Research Associate

2015 - 2017

The Open University, Knowledge Media Institute, Milton Keynes, UK

Led and managed two EU-funded research and innovation projects: COMRADES and TRIVALENT:

  • In TRIVALENT, I worked closely with over 20 research teams and policing organisations across Europe to tackle the problem of online radicalisation and terrorism on social media. This involved leveraging techniques from ML, User Behaviour Analysis, and Big Data to build a large-scale ML framework for detecting radicalised users on social media, tracking the divergence of user behaviours during the radicalisation process, and analysing what influences users to adopt a radicalised stance. Evaluation on over 100 million tweets showed that the developed models produce ~7% better accuracy than existing models.
  • In COMRADES, I designed a semantically enriched Deep Learning model for crisis-related event detection on social media. The developed model was shown to outperform existing ML models by up to 3.7%. This model is currently used by multiple international humanitarian organisations to identify crisis-related information in social streams.

Research Assistant

2014 - 2015

The Open University, Knowledge Media Institute, Milton Keynes, UK

Leveraged NLP, Machine Learning, Topic Modelling, and Semantic Web techniques to develop prediction models that capture the dynamics of citizens' policy discussions and the spread of polarised sentiment on social media. These models have been used by a wide range of political scientists and governmental organisations (including Members of Parliament in both Germany and the UK) as part of the EU-funded project SENSE4US.

Software Engineer (PT)

2013 - 2013

The Open University, Knowledge Media Institute, Milton Keynes, UK

Developed a web-based framework for assessing the quality of research publications within the Open University (OU) as part of the UK Research Excellence Framework (REF). This framework has been used continuously to assess the research performance of the OU and to support research grant allocations from multiple Higher Education funding bodies. Programmed in PHP, JavaScript, MySQL, and SPARQL.


      Drawing upon my deep knowledge of software engineering and machine learning, I provide consulting services and accept a limited number of freelance projects. Should you require assistance with an AI project, face a complicated software problem, or simply want to discuss the wonders of technology, I am happy to help. Get in touch to discuss the possibility of a collaboration.


      Milton Keynes, United Kingdom




      +44 75317 133398