How Android Malware Is Detected: From App Behaviors to On-Device AI

Quan Vo

In Android malware detection, the effectiveness of a deep learning model depends less on the model's complexity than on the features it learns from. In practice, feature extraction is often the most critical design decision, especially when detection must run directly on mobile devices.

Traditional server-side detection works well at scale, but it struggles in real-world mobile scenarios:

📦 Large APK uploads are slow on mobile networks
🔓 Man-in-the-middle risks exist during upload
🧭 Offline installation bypasses cloud checks
⏱️ Users expect instant installs, not background scans, so a server round trip leaves little time to flag malware before installation or execution

That’s where on-device detection becomes the last line of defense – protecting users before a malicious app ever runs.

A Practical Architecture for On-Device Detection

A modern on-device malware detection system follows a two-phase architecture (as illustrated below), separating heavy computation from real-time protection.

1. Server Side (Offline)

This phase is resource-heavy and done once:

Collect benign & malicious APKs
Extract security-relevant features
Train deep learning models (CNN, RNN, GRU or hybrid models)
Convert & quantize the model for mobile
Export as a TensorFlow Lite model (TensorFlow Lite is now named LiteRT)
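To make the quantization step more concrete, here is a minimal sketch of 8-bit affine quantization, the general idea behind shrinking float weights for mobile. It is plain Python for illustration only and is not how the TensorFlow Lite converter is implemented; the function names quantize and dequantize are hypothetical.

```python
from typing import List, Tuple


def quantize(weights: List[float]) -> Tuple[List[int], float, int]:
    """Map float weights to integers in 0..255 plus a scale and zero point."""
    # Widen the range so that 0.0 is always representable.
    lo = min(min(weights), 0.0)
    hi = max(max(weights), 0.0)
    # Fall back to a scale of 1.0 when every weight is zero.
    scale = (hi - lo) / 255.0 or 1.0
    zero_point = round(-lo / scale)
    quantized = [
        max(0, min(255, round(w / scale) + zero_point)) for w in weights
    ]
    return quantized, scale, zero_point


def dequantize(quantized: List[int], scale: float, zero_point: int) -> List[float]:
    """Recover approximate float weights from their 8-bit form."""
    return [(q - zero_point) * scale for q in quantized]


if __name__ == "__main__":
    q, scale, zp = quantize([-1.0, 0.0, 0.5, 1.0])
    print(q, scale, zp)
    print(dequantize(q, scale, zp))
```

Each weight now fits in one byte instead of four, at the cost of a rounding error of about one quantization step, which is the trade-off that makes on-device inference practical.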

2. Mobile Side (Runtime)

This phase must be fast and lightweight:

Load the quantized model
Extract features from the APK before installation
Run inference locally
Block or warn if malware is detected

No network connection is required, and the APK is analyzed without ever leaving the device.
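The last runtime step, turning a model score into an action, can be sketched as follows. This is an assumption-laden illustration: the thresholds 0.5 and 0.9, the Verdict names, and the scan_before_install helper are all hypothetical, and the model is passed in as a plain callable standing in for the quantized network.

```python
from enum import Enum
from typing import Callable, Sequence


class Verdict(Enum):
    ALLOW = "allow"
    WARN = "warn"
    BLOCK = "block"


def decide(score: float, warn_at: float = 0.5, block_at: float = 0.9) -> Verdict:
    """Map a malware probability in [0, 1] to an action (thresholds are assumed)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be a probability between 0 and 1")
    if score >= block_at:
        return Verdict.BLOCK
    if score >= warn_at:
        return Verdict.WARN
    return Verdict.ALLOW


def scan_before_install(
    features: Sequence[float],
    model: Callable[[Sequence[float]], float],
) -> Verdict:
    """Run local inference on the extracted features and return a verdict."""
    return decide(model(features))
```

Using two thresholds instead of one lets the system warn on borderline scores rather than blocking outright, which limits the damage of false positives.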

Feature Extraction Overview

Combining the following categories yields a comprehensive static feature set:

Permissions
Intents + actions (services & providers)
Hardware features
API calls
Strings and native libraries

To better understand how malicious behaviors are reflected in static features, we map permissions and component declarations to common malware techniques, as outlined below.

Together, they create a high-dimensional but lightweight representation of application behavior – suitable for deep learning models and efficient enough for on-device inference.
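As a rough sketch of how the categories might be combined, each one can be encoded against a fixed vocabulary and the results concatenated into a single vector. The VOCAB contents below are hypothetical placeholders; a real system would derive its vocabularies from the training corpus.

```python
from typing import Dict, Iterable, List

# Hypothetical, tiny vocabularies used only for illustration.
VOCAB: Dict[str, List[str]] = {
    "permissions": [
        "android.permission.SEND_SMS",
        "android.permission.READ_CONTACTS",
    ],
    "intents": ["android.intent.action.BOOT_COMPLETED"],
    "hardware": ["android.hardware.telephony"],
    "api_calls": ["SmsManager.sendTextMessage", "DexClassLoader.<init>"],
}


def encode(app_features: Dict[str, Iterable[str]]) -> List[int]:
    """Concatenate one binary sub-vector per category, in VOCAB order."""
    vector: List[int] = []
    for category, vocab in VOCAB.items():
        present = set(app_features.get(category, ()))
        vector.extend(1 if token in present else 0 for token in vocab)
    return vector


if __name__ == "__main__":
    app = {
        "permissions": ["android.permission.SEND_SMS"],
        "intents": ["android.intent.action.BOOT_COMPLETED"],
    }
    print(encode(app))
```

Because the vocabulary order is fixed, every app maps to a vector of the same length, which is what the model expects as input.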

1. AndroidManifest-Based Features

The AndroidManifest.xml file provides a high-level description of what an app intends to do. For malware detection, this is one of the most valuable and lightweight sources of information.

Common manifest-based features include:

Permissions
- Sensitive permissions (SMS, contacts, storage, phone state)
- Privilege escalation patterns
Intent filters, Services, and Providers
- Background services for long-running tasks
- Boot receivers (e.g., auto-start after reboot)
- Inter-app communication signals
Hardware requirements
- Telephony, GPS, camera access

Malware often requests more permissions than necessary, or unusual combinations of permissions that benign apps rarely need. These patterns are highly discriminative and cheap to extract.
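A minimal sketch of manifest feature extraction is shown below. It assumes the manifest has already been decoded to plain-text XML (inside an APK it is stored in a binary XML format, so a decoding step with a tool such as apktool is needed first). The function names manifest_features and starts_on_boot are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Attributes such as android:name live in this XML namespace.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"
BOOT_ACTION = "android.intent.action.BOOT_COMPLETED"


def _names(root: ET.Element, tag: str) -> List[str]:
    """Collect the android:name attribute of every element with the given tag."""
    key = ANDROID_NS + "name"
    return [e.get(key) for e in root.iter(tag) if e.get(key)]


def manifest_features(manifest_xml: str) -> Dict[str, List[str]]:
    """Extract permissions, hardware features, and component names."""
    root = ET.fromstring(manifest_xml)
    return {
        "permissions": _names(root, "uses-permission"),
        "hardware": _names(root, "uses-feature"),
        "actions": _names(root, "action"),
        "services": _names(root, "service"),
        "providers": _names(root, "provider"),
    }


def starts_on_boot(features: Dict[str, List[str]]) -> bool:
    """True if any intent filter listens for the boot-completed broadcast."""
    return BOOT_ACTION in features["actions"]
```

The whole pass is a single walk over a small XML tree, which is why manifest features are among the cheapest signals to compute on a phone.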

2. Strings and Native Libraries

Additional static indicators include:

Hardcoded URLs or IP addresses
Command-and-control patterns
Obfuscated strings
Native .so libraries used to bypass Java-level analysis

These features improve robustness against basic obfuscation techniques.
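Two of these indicators are simple to sketch with the standard library. The example below assumes the app's strings have already been pulled out of the APK by some earlier step, and it treats any .so file under lib/ as a native library. The regular expressions are deliberately simple and will produce false positives (a dotted version number such as 1.2.3.4 looks like an IP address); the function names are hypothetical.

```python
import re
import zipfile
from typing import BinaryIO, Dict, Iterable, List, Union

URL_RE = re.compile(r"https?://[^\s\"'<>]+")
# Four dot-separated octets, each in the range 0..255.
OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
IPV4_RE = re.compile(r"\b(?:" + OCTET + r"\.){3}" + OCTET + r"\b")


def network_indicators(strings: Iterable[str]) -> Dict[str, List[str]]:
    """Find hardcoded URLs and IPv4 addresses in already-extracted strings."""
    urls, ips = set(), set()
    for s in strings:
        urls.update(URL_RE.findall(s))
        ips.update(IPV4_RE.findall(s))
    return {"urls": sorted(urls), "ips": sorted(ips)}


def native_libraries(apk: Union[str, BinaryIO]) -> List[str]:
    """List native libraries by reading the APK as an ordinary ZIP archive."""
    with zipfile.ZipFile(apk) as zf:
        return sorted(
            name
            for name in zf.namelist()
            if name.startswith("lib/") and name.endswith(".so")
        )
```

Listing the archive entries does not decompress anything, so the native-library check stays cheap even for large APKs.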

3. API Call Features (classes.dex)

Beyond declared intent, malware reveals its behavior through API usage. These features are extracted directly from the classes.dex file, without decompiling the APK.

Key categories include:

System-level APIs
- TelephonyManager, SmsManager
- Device ID and SIM information
File and storage APIs
- External storage access
- File I/O patterns
Network APIs
- HTTP connections
- Socket communication
Reflection and dynamic loading
- Class.forName, DexClassLoader

Rather than tracking every API call, on-device systems typically use binary presence or frequency vectors, keeping feature extraction fast and memory-efficient.
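The difference between the two encodings can be sketched in a few lines. This example assumes the API call names have already been extracted from classes.dex by a DEX parser, a step not shown here; the API_VOCAB list and the cap of 255 (so each count fits in one byte) are hypothetical choices.

```python
from collections import Counter
from typing import Iterable, List, Sequence

# Hypothetical vocabulary of security-relevant APIs.
API_VOCAB: List[str] = [
    "SmsManager.sendTextMessage",
    "TelephonyManager.getDeviceId",
    "Class.forName",
    "DexClassLoader.<init>",
]


def presence_vector(calls: Iterable[str], vocab: Sequence[str] = API_VOCAB) -> List[int]:
    """1 if the API appears at least once in the app, else 0."""
    seen = set(calls)
    return [int(api in seen) for api in vocab]


def frequency_vector(
    calls: Iterable[str], vocab: Sequence[str] = API_VOCAB, cap: int = 255
) -> List[int]:
    """Number of occurrences per API, capped so each value fits in one byte."""
    counts = Counter(calls)
    return [min(counts[api], cap) for api in vocab]


if __name__ == "__main__":
    calls = ["Class.forName", "Class.forName", "SmsManager.sendTextMessage"]
    print(presence_vector(calls))
    print(frequency_vector(calls))
```

Both vectors have a fixed length equal to the vocabulary size, so memory use does not grow with the size of the app being scanned.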

4. Opcode-Level Features (Used Selectively)

Some detection systems include opcode n-grams or instruction-level patterns extracted from Dalvik bytecode.

While opcode features can capture low-level behavior, they come with trade-offs:

Higher extraction cost
Larger feature space
Easier to evade through packing or code transformation

For on-device detection, opcode features are often limited or avoided unless carefully optimized.
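One common way to keep the opcode feature space bounded is feature hashing, sketched below. This is an illustrative technique, not necessarily what any particular detector uses. It assumes the opcode mnemonics have already been disassembled from the Dalvik bytecode, and the defaults of 3-grams and 1024 buckets are hypothetical.

```python
import zlib
from typing import List, Sequence


def hashed_ngram_vector(
    opcodes: Sequence[str], n: int = 3, buckets: int = 1024
) -> List[int]:
    """Count opcode n-grams into a fixed number of hash buckets."""
    if n < 1 or buckets < 1:
        raise ValueError("n and buckets must be positive")
    vector = [0] * buckets
    for i in range(len(opcodes) - n + 1):
        gram = " ".join(opcodes[i:i + n])
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        index = zlib.crc32(gram.encode("utf-8")) % buckets
        vector[index] += 1
    return vector


if __name__ == "__main__":
    ops = ["const-string", "invoke-virtual", "move-result-object",
           "invoke-virtual", "return-void"]
    vec = hashed_ngram_vector(ops, n=3, buckets=16)
    print(sum(vec))  # number of 3-grams in the sequence
```

The vector length is fixed by the bucket count no matter how many distinct n-grams exist, at the cost of occasional collisions between unrelated n-grams.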

Final Insight

Android malware detection is not about chasing individual exploits; it is about recognizing intent expressed through structure.

By carefully selecting static features that map directly to real malicious behaviors, we can bring accurate, privacy-preserving malware detection directly onto mobile devices.

This is where mobile security meets on-device AI – and where it becomes truly scalable.
