Detecting Fraudulent Calls with Amazon QuickSight ML Insights


Fraud has a significant financial impact across many industries. According to a Financial Times article, the telecommunications sector loses approximately $17 billion annually to fraud. Fraudsters continually adapt to new technologies and devise new methods, which complicates detection. Many companies rely on traditional rules-based fraud detection systems, but these often become ineffective once fraudsters find ways to circumvent them. In addition, as data volumes grow, rules-based systems struggle to keep up, making it harder to identify and respond to fraudulent activity and ultimately leading to revenue loss.

Overview

Numerous AWS services offer anomaly detection capabilities that can aid in combating fraud. This discussion will focus on three key services:

  • Amazon Kinesis Data Analytics
  • Amazon SageMaker
  • Amazon QuickSight ML Insights

When addressing fraud detection, two primary challenges arise:

  1. Scale: The sheer volume of data that must be analyzed. For example, every call generates a Call Detail Record (CDR) event, containing various pieces of information such as originating and terminating phone numbers and call duration. When multiplied by the number of calls made each day, the scale becomes daunting for operators to manage.
  2. Machine Learning Expertise: Acquiring the necessary skills to leverage machine learning for solving business problems can be difficult. Developing these skills internally or hiring qualified data scientists with relevant domain knowledge is often a complex task.

Introducing Amazon QuickSight ML Insights

Amazon QuickSight is a powerful, cloud-based business intelligence (BI) service that allows organizations to easily derive insights from their data through interactive dashboards. With a pay-per-session pricing model and the ability to embed dashboards within applications, BI solutions are now more accessible and affordable for everyone.

As the volume of data generated by users continues to grow, the challenge of extracting valuable insights becomes more pronounced. This is where machine learning comes into play. Amazon is a leader in utilizing machine learning to automate various aspects of business analytics across sectors like supply chain, marketing, retail, and finance.

ML Insights combines Amazon’s proven technologies with QuickSight, enabling users to access ML-driven insights that extend beyond basic visualizations. Notable features include:

  • Anomaly Detection: Automatically identifying hidden insights by continuously analyzing billions of data points.
  • Forecasting and What-If Analysis: Easily predicting key business metrics with a simple point-and-click interface.
  • Auto-Narratives: Creating plain-language narratives that help users explain the story behind their dashboards.

In this article, I will illustrate how a telecom provider with limited machine learning expertise can leverage Amazon QuickSight’s ML capabilities to detect fraudulent calls.

Prerequisites

To implement this solution, you will need the following resources:

  • Amazon S3 for staging a Ribbon call detail record (CDR) sample in CSV format.
  • AWS Glue for running an ETL job in PySpark.
  • AWS Glue crawlers to identify table schemas and update the AWS Glue Data Catalog.
  • Amazon Athena for querying the data behind the Amazon QuickSight dataset.
  • Amazon QuickSight for building visualizations and performing anomaly detection using ML Insights.

The Dataset

This article utilizes a synthetic dataset provided by Ribbon Communications, generated by call test generators. The data is not sensitive or associated with real customers.

Analyzing the Data

The example below represents a typical CDR. The STOP CDR, shown here, is generated once a call has ended. It contains numerous values, most of which are irrelevant for fraud detection.

Revenue Share Fraud

Revenue share fraud is among the most prevalent schemes currently threatening the telecom sector. In this scheme, fraudsters use stolen or fraudulent numbers to call a premium rate number, and a share of the revenue those calls generate is paid back to the fraudster.

To detect national and international revenue share fraud using Amazon QuickSight ML, consider the usual characteristics of such calls. The pattern typically involves multiple A-numbers calling the same B-number or a series of B-numbers sharing a prefix. Call durations often exceed the average and can reach up to two hours, which is the maximum allowed by international switches. Generally, these calls originate from a specific cell or a group of cells.
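To make the pattern above concrete, here is a minimal Python sketch that groups calls by B-number and reports how many distinct A-numbers called it and the mean call duration. The field names (a_number, b_number, duration_s) and both thresholds are hypothetical and chosen only for illustration. This is a plain aggregation that shows the signal; it is not how QuickSight ML Insights performs anomaly detection.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_b_number(calls):
    """Return, per called number, the distinct caller count and mean duration."""
    callers = defaultdict(set)
    durations = defaultdict(list)
    for call in calls:
        callers[call["b_number"]].add(call["a_number"])
        durations[call["b_number"]].append(call["duration_s"])
    return {
        b: {
            "distinct_callers": len(callers[b]),
            "mean_duration_s": mean(durations[b]),
        }
        for b in callers
    }


def suspicious_b_numbers(calls, min_callers=3, min_mean_duration_s=1800):
    """List B-numbers called by many A-numbers with long average calls.

    Both thresholds are illustrative assumptions, not values from the article.
    """
    summary = summarize_by_b_number(calls)
    return sorted(
        b
        for b, stats in summary.items()
        if stats["distinct_callers"] >= min_callers
        and stats["mean_duration_s"] >= min_mean_duration_s
    )
```

Fixed thresholds like these are exactly what fraudsters learn to stay under, which is why the article turns to ML-based anomaly detection instead.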

A SIM card may make brief test calls to various B-numbers as a precursor to the actual fraudulent activity, often occurring during low-risk times such as Friday nights, weekends, or holidays. Conference calling can also be employed to execute several concurrent calls from a single A-number.
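The test-call behaviour described above could be sketched as follows. The code counts, per A-number, how many distinct B-numbers received a very short call during a quiet window. The window definition (Friday from 20:00, plus Saturday and Sunday), the 10-second cutoff, and the field names are all assumptions made for this example; holidays are left out because they would need a calendar the article does not provide.

```python
from collections import defaultdict
from datetime import datetime


def in_quiet_window(start):
    """True for Friday evenings and weekends (an assumed definition)."""
    weekday = start.weekday()  # Monday is 0, Sunday is 6
    if weekday in (5, 6):
        return True
    return weekday == 4 and start.hour >= 20


def probe_call_fanout(calls, max_duration_s=10):
    """Count distinct B-numbers per A-number among short quiet-window calls."""
    targets = defaultdict(set)
    for call in calls:
        if call["duration_s"] <= max_duration_s and in_quiet_window(call["start"]):
            targets[call["a_number"]].add(call["b_number"])
    return {a_number: len(b_numbers) for a_number, b_numbers in targets.items()}
```

An A-number with a high fan-out here is only a candidate for closer inspection, since legitimate short calls at weekends are common.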

Additionally, SIMs used for this type of fraud are frequently sold or activated in bulk from the same distributor. These SIMs may be recharged using fraudulent online payments or IVR transactions, including stolen credit card information. Both PAYG credit and bundles may be utilized. The relevant information for detecting such fraud includes:

  • Call Duration
  • Calling Number (A-number)
  • Called Number (B-number)
  • Call Start Time
  • Accounting ID

You can use this reference to help pinpoint these fields within a CDR.

After identifying the crucial details from the 235 columns in the CDR, I noticed that the raw sample data lacked a header. To streamline the process, I added the required column names to the raw CSV data and converted it to Parquet format.
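As a rough illustration of the header step, the sketch below reads headerless CSV text and keeps only the five fields of interest, attaching a name to each. The column positions in FIELD_POSITIONS are hypothetical placeholders; the real offsets have to come from the CDR field reference. The sketch uses only the Python standard library and stops before the Parquet conversion, which would need a separate tool such as a PySpark job in AWS Glue.

```python
import csv
import io

# Hypothetical 0-based column positions, for illustration only.
# Replace them with the offsets documented for the actual CDR format.
FIELD_POSITIONS = {
    "accounting_id": 0,
    "call_start_time": 1,
    "call_duration": 2,
    "calling_number": 3,
    "called_number": 4,
}


def label_rows(raw_csv_text, positions=FIELD_POSITIONS):
    """Yield one dict per CSV row, keeping only the named columns."""
    reader = csv.reader(io.StringIO(raw_csv_text))
    for row in reader:
        if not row:
            continue  # skip blank lines
        yield {name: row[index] for name, index in positions.items()}
```

Values are kept as strings here; casting durations and timestamps to proper types is best done when the data is written to its final format.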

Discovering the Data

In the AWS Glue console, create a crawler and name it CDR_CRAWLER. Point the crawler to the Amazon S3 bucket where the Parquet CDR data is stored.

Next, create a new IAM role for the AWS Glue crawler. This role must grant AWS Glue the permissions it needs to access other services on your behalf, including Amazon S3. Treat the AWS managed policies only as a starting point and tailor them to your business needs.
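For readers who prefer to see the role as policy documents, here is a minimal sketch that builds two of them in Python: a trust policy that lets the AWS Glue service assume the role, and a read-only policy scoped to one S3 location. The bucket and prefix are placeholders you supply, and a real crawler role usually needs more than this (for example, the AWS managed policy AWSGlueServiceRole), so treat it as an example of scoping rather than a complete role definition.

```python
import json


def glue_trust_policy():
    """Trust policy allowing the AWS Glue service to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "glue.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def s3_read_policy(bucket, prefix=""):
    """Read-only access to one bucket/prefix (names are caller-supplied)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}*"],
            },
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
        ],
    }


def to_json(policy):
    """Serialize a policy dict to the JSON text that IAM expects."""
    return json.dumps(policy, indent=2)
```

Limiting the S3 statement to the prefix that holds the Parquet CDR data keeps the crawler from reading unrelated objects in the same bucket.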

For the frequency setting, leave the default option of “Run on Demand.” Then, add a database and define its name. This database will contain the table identified by the AWS Glue crawler.

Once you have reviewed your crawler settings and are satisfied, you can finish the setup. Select the crawler you created (CDR_CRAWLER) and run it. The crawler then scans the Parquet data in Amazon S3 and records the table it infers in the database, which may take a minute or more.
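The console steps above can also be scripted. The sketch below assembles the arguments for the AWS Glue create_crawler API and then starts the crawler. It expects a Glue client to be passed in, for example one created with boto3.client("glue"). The role ARN, database name, and S3 path are placeholders, not values from this article. Leaving out the Schedule argument leaves the crawler to run on demand.

```python
def crawler_request(name, role_arn, database, s3_path):
    """Build keyword arguments for the Glue create_crawler call.

    No Schedule key is set, so the crawler runs only when started.
    """
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def create_and_start(glue_client, request):
    """Create the crawler, then start a run."""
    glue_client.create_crawler(**request)
    glue_client.start_crawler(Name=request["Name"])


# Example wiring (placeholder values; requires boto3 and AWS credentials):
#   import boto3
#   request = crawler_request(
#       "CDR_CRAWLER",
#       "arn:aws:iam::123456789012:role/example-glue-role",
#       "example_cdr_db",
#       "s3://example-bucket/cdr/",
#   )
#   create_and_start(boto3.client("glue"), request)
```

Passing the client in as a parameter keeps the functions easy to exercise with a stub before pointing them at a real account.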

Once complete, navigate to the Data Catalog and select Databases. You should see the new database, which now contains the table created by the crawler.
