- decrypt.day
- IPA Archive
- AppCake
- Appdb (original + modified)
- Sideload apps not on App Store
- AppleSiliconUIKitPatch
- Decrypter on macOS when SIP-enabled (macOS 11.3 or below)
Some notes, tools, and techniques for reverse engineering iOS/iPadOS binaries.
CheckRa1n was enough for my devices on iOS 14.x.
Older, stable Jailbreaks like Electra still work. Things to remember:
Cydia Impactor to install a fresh copy of the Electra app.
# find your "Apple Development" ID
security find-identity -v -p codesigning
# sign Electra app with a free Developer Account
applesign -7 -i ${CODESIGNID} -m embedded.mobileprovision Electra1141-2.0.ipa -o ready.ipa --clone-entitlements
# Deploy it to the device with a different bundle ID
ios-deploy --bundle_id='com.bar.baz.foo' -b ready.ipa
Decrypting the app binary is essential if you want to find good strings, debug the app, or repackage the IPA.
# Get script to decrypt IPA
https://github.com/AloneMonkey/frida-ios-dump
# Attach a jailbroken iPhone and create tunnel over USB
iproxy 2222 22 &
# Ensure Frida is running on iOS device. Then run frida-ios-dump
./dump.py foo.bar.bundleid
# Check AppStore binary is now decrypted ( cryptid 0 decrypted vs cryptid 1 encrypted )
otool -l Payload/foo.app/foo | grep -i LC_ENCRYPTION -B1 -A4
Load command 12
cmd LC_ENCRYPTION_INFO_64
cryptid 0
--
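The cryptid check above can also be scripted. This is a minimal sketch, assuming you have already captured the text printed by `otool -l`; the function name `encryption_state` and the sample text are illustrative and not part of any tool.

```python
def encryption_state(otool_output):
    """Summarise the cryptid values found in `otool -l` text."""
    cryptids = []
    in_encryption_cmd = False
    for line in otool_output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue
        key, value = fields
        if key == "cmd":
            # Track whether the current load command is LC_ENCRYPTION_INFO(_64)
            in_encryption_cmd = value.startswith("LC_ENCRYPTION_INFO")
        elif key == "cryptid" and in_encryption_cmd:
            cryptids.append(int(value))
    if not cryptids:
        return "no encryption info"
    return "encrypted" if any(cryptids) else "decrypted"


# Hypothetical sample, shaped like the otool output shown above
SAMPLE = """Load command 12
cmd LC_ENCRYPTION_INFO_64
cmdsize 24
cryptoff 16384
cryptsize 4096
cryptid 0
"""

print(encryption_state(SAMPLE))
```

A fat binary prints one encryption command per slice, so the sketch reports `encrypted` if any slice still has `cryptid 1`.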
Open Apple Configurator 2 and "sign in" with the same Apple account used on the target device.
Select Add/Apps.
At this point Apple Configurator 2 will download a copy of the app to:
~/Library/Group Containers/K36BKF7T3D.group.com.apple.configurator/Caches/Assets/TemporaryItems/MobileApps/
When you hit the "Skip App / Replace / Stop" modal, select nothing. Go to Finder and grab the IPA.
Works on a clean device or Jailbroken device:
# Install Objection
pip3 install objection
# attach Objection to the app (through Frida or an embedded Frida Gadget) and open its REPL
objection --gadget "com.apple.AppStore" explore
# KeyChain dump
ios keychain dump --json output.json
# Unzip the IPA file to reveal the Payload folder
unzip myApp.ipa
# big files inside ipa file
find Payload -size +2M
# Files that were mistakenly shipped inside of the App Bundle
find . -name '*.json' -or -name '*.txt'
# Check for ReactNative
find . -name main.jsbundle
# Check for Certificates
find . -name '*.crt' -or -name '*.cer' -or -name '*.der'
# Property lists inside Payload folder. Recursive search.
find Payload/ -name '*.plist'
# Provisioning Profiles
find . -name '*.mobileprovision'
# Dynamically linked frameworks
find . -name '*.framework'
# Locally linked javascript
find Payload -name '*.js'
# Search all plist files for a value (-print0 / -0 copes with spaces in bundle paths)
find . -name '*.plist' -print0 | xargs -0 grep "LSApplicationQueriesSchemes"
# Search all plist files for Device Permissions or App Transport Security
find . -name '*.plist' -print0 | xargs -0 grep "NS"
# Search all files using only grep
grep "LSApplicationQueriesSchemes" . -R
# Recursive search all files using grep inside Payload folder
grep "Requires" Payload -R
# foobar.app/Info.plist: <key>UIRequiresFullScreen</key>
# foobar.app/Info.plist: <key>LSRequiresIPhoneOS</key>
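The `find` commands above can be collapsed into one pass that reads the IPA directly, without unzipping it. This is a rough sketch using only the standard library; the category labels, the `triage_ipa` name and the 2 MiB default are illustrative choices that mirror the commands above.

```python
import zipfile
from pathlib import PurePosixPath

# Suffix groups mirror the find commands above; the labels are illustrative.
INTERESTING = {
    "config/text": {".json", ".txt"},
    "certificate": {".crt", ".cer", ".der"},
    "plist": {".plist"},
    "provisioning": {".mobileprovision"},
    "javascript": {".js", ".jsbundle"},
}


def triage_ipa(ipa, big_file_bytes=2 * 1024 * 1024):
    """Group file names inside an IPA (path or file object) by category."""
    findings = {}
    with zipfile.ZipFile(ipa) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            suffix = PurePosixPath(info.filename).suffix.lower()
            for label, suffixes in INTERESTING.items():
                if suffix in suffixes:
                    findings.setdefault(label, []).append(info.filename)
            # Uncompressed size, roughly what `find -size +2M` looks at
            if info.file_size > big_file_bytes:
                findings.setdefault("big file", []).append(info.filename)
    return findings
```

Call it as `triage_ipa("myApp.ipa")` and print the resulting dictionary; frameworks are directories, so they are not covered by this sketch.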
# Sandbox. Look here for Cookies, json files, etc
/var/mobile/Containers/Data/Application/[GUID given at install time]/
# Folder of App Bundle that was installed. Executables, frameworks, fonts, CSS, html. NIB files.
/private/var/containers/Bundle/Application/[GUID given at app install]/foo.app
# App executable
/private/var/containers/Bundle/Application/[GUID given at app install]/foo.app/foo
# freshly installed IPA is at the bottom of list
cd /private/var/mobile/Containers/Data/Application/ && ls -lrt
cd [app guid]/Documents/
cd [app guid]/Library/
# Databases to pull off a device
/private/var/Keychains
TrustStore.sqlite3
keychain-2.db
pinningrules.sqlite3
# Extract IPA (whether App Store encrypted or not)
scp -r -P 2222 root@localhost:/var/containers/Bundle/Application/<app GUID>/hitme.app ~/hitme.app
# Unlike ssh (lowercase -p), scp takes an uppercase -P for the port. Put it before the file arguments.
scp -P 2222 root@localhost:/var/root/overflow.c localfilename.c
# from Jailbroken device to local machine
# Caution: no space after "root@localhost:". Otherwise you copy the entire filesystem!
scp -r -P 2222 root@localhost:/private/var/mobile/Containers/Data/Application/<App GUID>/Library/Caches/Snapshots/com.my.app .
# from local machine to remote Jailbroken device
scp -P 2222 hello.txt root@localhost:/var/root/
# physical device
idevicesyslog -u <DeviceID> | myPipedProgram
# Get logs from iOS Simulator
xcrun simctl spawn booted log stream --level=debug
# Get logs from iOS Simulator by App Name
xcrun simctl spawn booted log stream --predicate 'processImagePath endswith "MyAppName"'
lipo -info libprogressbar.a
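For a fat (universal) file, `lipo -info` reports the architecture list stored in the fat header at the start of the file. A rough sketch of that lookup follows, assuming the 32-bit fat header (`FAT_MAGIC`); the function name and the CPU-type table are illustrative and cover only a few common values.

```python
import struct

FAT_MAGIC = 0xCAFEBABE  # fat header fields are stored big-endian

# Partial table of Mach-O CPU types; anything else is shown as hex
CPU_TYPES = {
    7: "i386",
    0x01000007: "x86_64",
    12: "arm",
    0x0100000C: "arm64",
}


def fat_architectures(data):
    """Return the architecture names listed in a fat header."""
    magic, nfat_arch = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a fat (universal) file")
    archs = []
    for index in range(nfat_arch):
        # Each fat_arch entry: cputype, cpusubtype, offset, size, align
        cputype, _subtype, _offset, _size, _align = struct.unpack_from(
            ">IIIII", data, 8 + index * 20
        )
        archs.append(CPU_TYPES.get(cputype, hex(cputype)))
    return archs
```

The sketch ignores the CPU subtype, so an arm64e slice is reported as plain `arm64`. Thin files raise `ValueError` rather than being parsed.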
jtool -arch arm64 -L <binary inside app bundle>
jtool -arch arm64 -l <binary inside app bundle>
rabin2 -H playground
objdump --macho --section-headers Payload/myApp.app/myApp
codesign -d --entitlements :- Payload/MyApp.app
jtool -arch arm64 --ent <binary inside app bundle>
cat Payload/*/Info.plist | grep -i NS
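`grep` only works on XML plists, while an `Info.plist` inside a shipped app is often in binary format. A small sketch that handles both follows, using the standard `plistlib` module; the function name `ns_keys` is illustrative.

```python
import plistlib


def ns_keys(info_plist_bytes):
    """Return the top-level NS* entries (permissions, ATS) of an Info.plist."""
    # plistlib.loads detects XML and binary plists automatically
    info = plistlib.loads(info_plist_bytes)
    return {key: value for key, value in info.items() if key.startswith("NS")}
```

Read the file with `open(path, "rb").read()` and pass the bytes in. Keys such as `NSAppTransportSecurity` come back as nested dictionaries.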
https://gist.github.com/adamawolf/3048717
rabin2 -I -a arm_64 <binary inside app bundle> | grep -E 'stripped|canary'
rabin2 -I -a arm_64 <binary inside app bundle> | grep -E 'pic|bits'
otool -l libprogressbar.a | grep __LLVM
otool -arch arm64 -l tinyDynamicFramework | grep __LLVM
// Remember this command won't work on a locally built Simulator / iPhone app. Bitcode happens after setting `Archive`
nm libprogressbar.a | less
rabin2 -s file
is~FUNC
strings <binary inside app bundle> | grep -E 'session|https'
strings <binary inside app bundle> | grep -E 'pinning'
rabin2 -qz <binary inside app bundle> # in Data Section
rabin2 -qzz <binary inside app bundle> # ALL strings in binary
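When neither `strings` nor `rabin2` is at hand, the same search can be approximated in a few lines. This is a sketch of the idea only (runs of printable ASCII filtered by a regular expression); the function names and the minimum length of 4 are assumptions.

```python
import re


def printable_strings(data, min_len=4):
    """Return runs of printable ASCII of at least min_len characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


def grep_strings(data, regex, min_len=4):
    """Keep only the strings that match regex, like `strings | grep -E`."""
    needle = re.compile(regex)
    return [s for s in printable_strings(data, min_len) if needle.search(s)]
```

For example, `grep_strings(open(path, "rb").read(), r"session|https")` mirrors the first command above. It does not decode UTF-16 strings.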
jtool -dA __TEXT.__cstring c_playground
Dumping C-Strings from address 0x100000f7c (Segment: __TEXT.__cstring)..
Address : 0x100000f7c = Offset 0xf7c
0x100000f7c: and we have a winner @ %ld\r
0x100000f98: and that's a wrap folks!\r
Applesign is a wrapper around the Codesigning tools from Apple.
npm install -g applesign
#### Create provisioning file
First, you want to get hold of an `embedded.mobileprovision` file. Fear not, this step is simple.
Open `Xcode` and select `File/New/Project/Swift` and call it `foobar`. Select `build` for Generic (ARM) Device. Do not select a simulator. This is normally enough.
You don’t need to `run` the app unless you want to automagically add your device’s UUID to the Provisioning Profile.
Now right click on the `/Product/foobar.app` - in the left hand view pane - and select "show in finder". If you look inside the folder ( remember `foobar.app` is a folder ) you will find a fresh `embedded.mobileprovision`. This contains the unique IDs and an expiry date for the developer profile associated with the app.
#### Read the Provisioning Profile
Ensure your device ID is in the profile and the profile is fresh.
`security cms -D -i embedded.mobileprovision`
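The two checks (device ID present, profile not expired) can be automated once the profile has been decoded. This is a minimal sketch, assuming the output of `security cms -D -i embedded.mobileprovision` was saved to a file and read as bytes; `profile_summary` is an illustrative name, and `ExpirationDate` and `ProvisionedDevices` are the plist keys it relies on.

```python
import plistlib
from datetime import datetime, timezone


def profile_summary(decoded_profile, device_udid, now=None):
    """Report expiry and device membership for a decoded provisioning profile."""
    profile = plistlib.loads(decoded_profile)
    # plistlib returns naive datetimes, so compare against a naive UTC "now"
    now = now or datetime.now(timezone.utc).replace(tzinfo=None)
    return {
        "name": profile.get("Name"),
        "expired": profile["ExpirationDate"] <= now,
        "device_listed": device_udid in profile.get("ProvisionedDevices", []),
    }
```

A profile without a `ProvisionedDevices` list is simply reported as `device_listed: False` here.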
#### List all of your Code signing identities
```bash
security find-identity -v -p codesigning
export CODESIGNID=<GUID>
applesign -7 -i ${CODESIGNID} --bundleid funky-chicken.resigned
applesign -7 -i ${CODESIGNID} -m embedded.mobileprovision unsigned.ipa -o ready.ipa
applesign -7 -i ${CODESIGNID} myapp.ipa -o resigned.ipa
# Remove stale archives, then re-pack the Payload folder
rm -v unsigned.ipa ready.ipa; 7z a unsigned.ipa Payload
# Keep original Bundle ID
applesign -7 -i ${CODESIGNID} -m embedded.mobileprovision unsigned.ipa -o ready.ipa
# Set Bundle ID
# applesign -7 -i ${CODESIGNID} -b yd.com.rusty.repackaged -m embedded.mobileprovision unsigned.ipa -o ready.ipa
ios-deploy -b ready.ipa
ios-deploy -b myapp-resigned.ipa      # defaults to send over wifi
ios-deploy -W -b myapp-resigned.ipa   # uses USB (-b must be followed by the IPA path)
ios-deploy -B | grep -i funky         # list Bundle IDs
```
Title | Detail |
---|---|
Missing Device ID | Check the Provisioning Profile (`embedded.mobileprovision`) includes the device's UUID. |
Check code sign key has not expired | Code Signing keys expire. The paid iOS Developer license lasts one year. The free developer signing key expires much sooner. |
Wrong Code-Signing Key | Check the Code Signing Key was NOT an iPhone Distribution key. |
identity is no longer valid | Error 0xe8008018: The identity used to sign the executable is no longer valid. Make sure the Apple Development key was selected when running `security find-identity -v -p codesigning`. I hit this error when I selected a Developer ID Application identity. I should have selected the ID associated with the Apple Development credential. |
Code Signing Keys Match | Check the Code Signing Key used when creating the Provisioning Profile matches the Code Signing Key selected when repackaging and code signing. |
Xcode check | When generating an app to get hold of an `embedded.mobileprovision` file, remember the Code signing options are different for each Project Target and ProjectTests. |
Delete Old Apps | Check no old app is installed on the phone that has the same Bundle ID but was signed with a different key. |
Entitlements overload | You can have a Provisioning Profile (`embedded.mobileprovision`) that contains more Capabilities than the app you are re-signing. |
Clone Entitlements | When the app is complicated, with many entitlements, sometimes it is easier just to `--clone-entitlements` with Applesign. |
Wrong Bundle ID | When you add specific Entitlements you need a unique Bundle ID. Check whether you need to change the Bundle ID when re-signing. |
Network settings | Go to Settings\General\Profiles and Device Management to trust the Developer Profile and App. This won't happen if you are manually proxying or setting a local DNS server when installing with ios-deploy. |
If none of the above work, open Console.app on macOS. Select your device and set process:mobile_installation_proxy in the Search Bar. This will give details behind the sideloaded IPA error message.
#### update host machine
pip3 install --upgrade frida
# list available devices
frida-ls-devices
# list processes and bundle ID from USB connected device
frida-ps -Uai
# Force open Calendar on USB attached device
frida -U -f com.apple.mobilecal
# spawn the app over USB and let it run (recent frida-tools resume by default; drop --no-pause if it is rejected)
frida -U -f com.apple.mobilecal --no-pause
# get the target app's process ID from USB connected device
frida-ps -U | grep -i myapp
# Run script and quit Frida
frida -U -f foobar --no-pause -q --eval 'console.log("Hi Frida");'
Since Frida version ~12.7, it has been quick and simple to run Frida on a jailed device:
# Get Frida-Gadget
<https://github.com/frida/frida/releases>
# Unzip
gunzip frida-gadget-12.xx.xx-ios-universal.dylib.gz
# Create directory for Frida-Gadget
mkdir -p ~/.cache/frida
# Move Frida-Gadget
cp frida-gadget-12.xx.xx-ios-universal.dylib ~/.cache/frida/gadget-ios.dylib
# Invoke Frida-Gadget on Clean device
frida -U -f funky-chicken.debugger-challenge
frida -U "My App" // Attach Frida to app over USB
Process.id
419
Process.getCurrentThreadId()
3843
var b = "hello frida"
console.log(b)
"hello frida"
c = Memory.allocUtf8String(b)
"0x1067ec510"
Memory.readUtf8String(c)
"hello frida"
console.log(c)
0x1067ec510
console.log(c.readUtf8String(5))
hello
console.log(c.readUtf8String(11))
hello frida
ptrToC = new NativePointer(c);
"0x1067ec510"
console.log(ptrToC)
0x1067ec510
console.log(ptrToC.readCString(8))
hello fr
Memory.readUtf8String(ptrToC)
"hello frida"
Objective-C's syntax includes the : and @ characters. These characters are not used in the Frida JavaScript API. Each : in a selector becomes an _ (for example, stringWithString: is called as stringWithString_).
// Attach to playground process ID
frida -p $(ps -ax | grep -i -m1 playground |awk '{print $1}')
ObjC.available
true
ObjC.classes.UIDevice.currentDevice().systemVersion().toString()
"11.1"
ObjC.classes.NSBundle.mainBundle().executablePath().UTF8String()
ObjC.classes.UIWindow.keyWindow().toString()
RET: <WKNavigation: 0x106e165c0>
// shows Static Methods and Instance Methods
ObjC.classes.NSString.$ownMethods
ObjC.classes.NSString.$ivars
var myDate = ObjC.classes.NSDate.alloc().init()
console.log(myDate)
2019-04-19 19:03:46 +0000
myDate.timeIntervalSince1970()
1555700626.021566
myDate.description().toString()
"2019-04-19 19:03:46 +0000"
var a = ObjC.classes.NSUUID.alloc().init()
console.log(a)
4645BFD2-94EE-413D-9CE5-8982D41ED6AE
a.UUIDString()
{
"handle": "0x7ff3b2403b20"
}
a.UUIDString().toString()
"4645BFD2-94EE-413D-9CE5-8982D41ED6AE"
var b = ObjC.classes.NSString.stringWithString_("foo");
b.isKindOfClass_(ObjC.classes.NSString)
true
b.isKindOfClass_(ObjC.classes.NSUUID)
false
b.isEqualToString_("foo")
true
b.description().toString()
"foo"
var c = ObjC.classes.NSString.stringWithFormat_('foo ' + 'bar ' + 'lives');
console.log(c)
foo bar lives
var url = ObjC.classes.NSURL.URLWithString_('www.foobar.com')
console.log(url)
www.foobar.com
url.isKindOfClass_(ObjC.classes.NSURL)
true
console.log(url.$class)
NSURL
var b = ObjC.classes.NSString.stringWithString_("foo");
var d = ObjC.classes.NSData
d = b.dataUsingEncoding_(1) // NSASCIIStringEncoding = 1, NSUTF8StringEncoding = 4,
console.log(d)
<666f6f> // This prints the Hex value "666f6f = foo"
d.$className
"NSConcreteMutableData"
var x = d.CKHexString() // get you the Byte array as a Hex string
console.log(x)
666f6f
x.$className
"NSTaggedPointerString"
var newStr = ObjC.classes.NSString.stringWithUTF8String_(d.bytes()) // call with (), not []; assumes the bytes are NUL-terminated
// demoapp is the iOS app name
myapp=$(ps x | grep -i -m1 demoapp | awk '{print $1}')
frida-trace -i "getfsent*" -p $myapp
// Connect to process with Frida script
frida --codeshare mrmacete/objc-method-observer -p 85974
Process.enumerateModules()
// this will print all loaded Modules
Process.findModuleByName("libboringssl.dylib")
{
"base": "0x1861e2000",
"name": "libboringssl.dylib",
"path": "/usr/lib/libboringssl.dylib",
"size": 712704
}
Process.findModuleByAddress("0x1c1c4645c")
{
"base": "0x1c1c2a000",
"name": "libsystem_kernel.dylib",
"path": "/usr/lib/system/libsystem_kernel.dylib",
"size": 200704
}
DebugSymbol.fromAddress(Module.findExportByName(null, 'strstr'))
{
"address": "0x183cb81e8",
"fileName": "",
"lineNumber": 0,
"moduleName": "libsystem_c.dylib",
"name": "strstr"
}
Module.findExportByName(null, 'strstr')
"0x183cb81e8"
Module.getExportByName(null,'strstr')
"0x183cb81e8"
Process.findModuleByAddress("0x183cb81e8")
{
"base": "0x183cb6000",
"name": "libsystem_c.dylib",
"path": "/usr/lib/system/libsystem_c.dylib",
"size": 516096
}
a = Process.findModuleByName("Reachability")
a.enumerateExports()
....
{
"address": "0x102fab020",
"name": "ReachabilityVersionString",
"type": "variable"
},
{
"address": "0x102fab058",
"name": "ReachabilityVersionNumber",
"type": "variable"
}
....
...
..
frida -U -f funky-chicken.debugger-challenge --no-pause -q --eval 'var x={};Process.enumerateModulesSync().forEach(function(m){x[m.name] = Module.enumerateExportsSync(m.name)});' | grep -B 1 -A 1 task_threads
"address": "0x1c1c4645c",
"name": "task_threads",
"type": "function"
frida -U -f funky-chicken.debugger-challenge --no-pause -q --eval 'var x={};Process.findModuleByAddress("0x1c1c4645c");'
{
"base": "0x1c1c2a000",
"name": "libsystem_kernel.dylib",
"path": "/usr/lib/system/libsystem_kernel.dylib",
"size": 200704
}
[objc_playground]-> var a = ObjC.classes.NSString.stringWithString_("foo");
[objc_playground]-> a.superclass().toString()
"NSString"
[objc_playground]-> a.class().toString()
"NSTaggedPointerString"
// PASTE THIS CODE INTO THE FRIDA INTERFACE...
Interceptor.attach(ObjC.classes.NSTaggedPointerString['- isEqualToString:'].implementation, {
onEnter: function (args) {
var str = new ObjC.Object(ptr(args[2])).toString()
console.log('[+] Hooked NSTaggedPointerString[- isEqualToString:] ->' , str);
}
});
// TRIGGER YOUR INTERCEPTOR
[objc_playground_2]-> a.isEqualToString_("foo")
[+] Hooked NSTaggedPointerString[- isEqualToString:] -> foo
1 // TRUE
[objc_playground_2]-> a.isEqualToString_("bar")
[+] Hooked NSTaggedPointerString[- isEqualToString:] -> bar
0 // FALSE
// frida -U -l open.js --no-pause -f com.yd.demoapp
// the below javascript code is the contents of open.js
var targetFunction = Module.findExportByName("libsystem_kernel.dylib", "open");
Interceptor.attach(targetFunction, {
onEnter: function (args) {
const path = Memory.readUtf8String(this.context.x0);
console.log("[+] " + path)
}
});
try {
    var targetFunctPtr = Module.findExportByName("YDAppModule", "$s9YDAppModule17ConfigC33publicKeyVerifyCertsSayypGvpfi");
    if (targetFunctPtr == null) {
        // throw an Error object so that err.message is populated in the catch block
        throw new Error("[*] Target function not found");
    }
    Interceptor.attach(targetFunctPtr, {
        onLeave: function(retval) {
            var array = new ObjC.Object(retval);
            console.log('[*]ObjC Class Type:\t' + array.$className);
        }
    });
    // logged once, when the hook is installed
    console.log("[*] publicKeyVerifyCertificates hooked");
}
catch(err){
    console.log("[!] Exception: " + err.message);
}
frida-trace --v // check it works
frida-trace --help // excellent place to read about Flags
frida-trace -f objc_playground // spawn and NO trace
frida-trace -m "+[NSUUID UUID]" -U "Debug CrackMe" // trace ObjC UUID static Class Method
frida-trace -m "*[ComVendorDebugger* *]" -U -f com.robot.demo.app // ObjC wildcard trace on Classes
frida-trace -m "*[YDDummyApp.UserProfileMngr *]" -U -f com.robot.demo.app // Trace mangled Swift functions
Instrumenting functions...
/* TID 0x403 */
1128 ms -[YDDummyApp.UserProfileMngr init]
1130 ms -[YDDummyApp.UserProfileMngr .cxx_destruct]
frida-trace -i "getaddrinfo" -i "SSLSetSessionOption" -U -f com.robot.demo // trace C function on iOS
frida-trace -m "*[*URLProtection* *]" -U -f com.robot.demo // for https challenge information
frida-trace -m "*[NSURLSession* *didReceiveChallenge*]" -U -f com.robot.demo // check whether https check delegate used
frida-trace -U -f com.robot.demo.app -I libsystem_c.dylib // Trace entire Module. Bad idea!
frida-trace -p $myapp -I UIKit // Trace UIKit Module. Bad idea.
frida-trace -f objc_playground -I CoreFoundation // Trace CoreFoundation Module. Terrible idea.
frida-trace -I YDRustyKit -U -f com.yd.mobile // Trace my own module.
frida-trace -m "-[NSURLRequest initWithURL:]" -U -f com.robot.demo // Get app files and APIs
frida-trace -m "-[NSURL initWithString:]" -U -f com.robot.demo // find the API endpoints
frida-trace -m "*[NSURL absoluteString]" -U -f com.robot.demo // my favorite of these
Edit the Frida-Trace auto-generated, template file.
onEnter: function (log, args, state) {
log("-[NSURLRequest initWithURL:" + args[2] + "]");
var str = new ObjC.Object(ptr(args[2])).toString()
console.log('[*] ' , str);
},
// results
[*] https://secretserver.nl/SignIn
frida-trace -i "*strcpy" -f hitme aaaa bbbb
Instrumenting functions...
_platform_strcpy: Loaded handler at "/.../__handlers__/libSystem.B.dylib/_platform_strcpy.js"
Started tracing 1 function. Press Ctrl+C to stop.
Edit the auto-generated, template Javascript file.
-----------
onEnter: function (log, args, state) {
// strcpy() arg1 is the Source. arg0 is the Destination.
console.log('\n[+] _platform_strcpy()');
var src_ptr = args[1].toString()
var src_string = Memory.readCString(args[1]);
var src_byte_array = Memory.readByteArray(args[1],4);
var textDecoder = new TextDecoder("utf-8");
var decoded = textDecoder.decode(src_byte_array);
console.log('[+] src_ptr\t-> ' , src_ptr);
console.log('[+] src_string\t-> ' + src_string);
console.log('[+] src_byte_array\t-> ' + src_byte_array);
console.log('[+] src_byte_array size\t-> ' + src_byte_array.byteLength);
console.log('[+] src_byte_array decoded\t-> ' + decoded);
},
The results:
[+] _platform_strcpy()
[+] src_ptr -> 0x7ffeefbffaa6
[+] src_string -> aaaa
[+] src_byte_array -> [object ArrayBuffer]
[+] src_byte_array size -> 4
[+] decoded -> aaaa
[+] _platform_strcpy()
[+] src_ptr -> 0x7ffeefbffaab
[+] src_string -> bbbb
[+] src_byte_array -> [object ArrayBuffer]
[+] src_byte_array size -> 4
[+] decoded -> bbbb
frida-ps -Uai // get your bundle ID
frida --codeshare mrmacete/objc-method-observer -U -f $BUNDLE_ID
[+] At the Frida prompt...
// Method isJailbroken
observeSomething('*[* isJail*]')
// Observe String compares
observeSomething('*[* isEqualToString*]');
// A Class (ObjC) or Module (Symbol). The first asterisk indicates it can be either an Instance or Class method
observeSomething('*[ABC* *]');
// Watch Cookies
observeSomething('-[WKWebsiteDataStore httpCookieStore]');
observeSomething('-[WKWebAllowDenyPolicyListener *]');
// dump the URL to hit
observeSomething('-[WKWebView loadRequest:]');
// you get all HTML, js, css, etc
observeSomething('-[WKWebView load*]');
// Read the entire request
observeSomething('-[WKWebView loadHTMLString:baseURL:]')
// Check for a custom UserAgent
observeSomething('-[WKWebView *Agent]');
# Rename Frida process
bash -c "exec -a YDFooBar ./frida-server &"
# Set Frida-Server on host to a specific interface and port
frida-server -l 0.0.0.0:19999 &
# Call Frida-server from Host
frida-ps -ai -H 192.168.0.38:19999
# Trace on custom port
frida-trace -m "*[NSURLSession* *didReceiveChallenge*]" -H 192.168.0.38:19999 -f $BUNDLE_ID
/private/var/mobile/Containers/Data/Application/<app guid, given at install time>/Library/Cookies/Cookies.binarycookies
scp -P 2222 root@localhost:/private/var/mobile/Containers/Data/Application/<App GUID>/Library/Cookies/Cookies.binarycookies cookies.bin
BinaryCookieReader: Written By Satishb3 (http://www.securitylearn.net)
python BinaryCookieReader.py Cookie.Binarycookies-FilePath
Cookie : s_fid=0BBD745EA9BCF67F-366EC6EDEFA2A0E6; domain=.apple.com; path=/; expires=Thu, 14 Dec 2023;
Cookie : s_pathLength=homepage%3D2%2C; domain=.apple.com; path=/; expires=Fri, 14 Dec 2018;
Cookie : s_vi=[CS]v1|2E09D702852E4ACE-60002D37A0008393[CE]; domain=.apple.com; path=/; expires=Sun, 13 Dec 2020;
............
............
$) ps -ax | grep -i WebKit.Networking
29163 ?? <longPath>/.../com.apple.WebKit.Networking
$) frida --codeshare mrmacete/objc-method-observer -p 29163
[PID::29163]-> %resume
[PID::29163]-> observeSomething('*[* cookiesWithResponseHeaderFields:forURL:]');
Results:
+[NSHTTPCookie cookiesWithResponseHeaderFields:forURL:]
cookiesWithResponseHeaderFields: {
"Set-Cookie" = "EuConsent=<removed for brevity>; path=/; expires=Sat, 16 Nov 2019 14:51:01 GMT;";
} (__NSSingleEntryDictionaryI)
forURL: https://uk.yahoo.com/?p=us&guccounter=1 (NSURL)
RET: (
"<NSHTTPCookie
version:0
name:EuConsent
value:<removed for brevity>
expiresDate:'2019-11-16 14:51:01 +0000'
created:'2019-11-15 14:51:01 +0000'
sessionOnly:FALSE
domain:yahoo.com
partition:none
sameSite:none
path:/
isSecure:FALSE
path:"/" isSecure:FALSE>"
)
WARNING: only change the minimum iOS version of a specific app's plist and not for the entire device. Things start to break - like calls into C libraries - when you change the device's read-only iOS version.
ssh onto device
root# cd /System/Library/CoreServices/
root# cat SystemVersion.plist
root# nano SystemVersion.plist
EDIT THE VALUE. KEEP THE OLD VALUE!
https://developer.apple.com/library/archive/qa/qa1964/_index.html
otool -l -arch all my_framework | grep __llvm_prf
nm -m -arch all my_app | grep gcov
Some notes, tools, and techniques for reverse engineering macOS binaries.
Binary Ninja is an interactive decompiler, disassembler, debugger, and binary analysis platform built by reverse engineers, for reverse engineers. Developed with a focus on delivering a high-quality API for automation and a clean and usable GUI, Binary Ninja is in active use by malware analysts, vulnerability researchers, and software developers worldwide. Decompile software built for many common architectures on Windows, macOS, and Linux for a single price, or try out our limited (but free!) Cloud version.
There are two ways to try Binary Ninja for free! Binary Ninja Cloud supports all architectures, but requires you to upload your binaries. Binary Ninja Free is a downloadable app that runs locally, but has architecture restrictions. Neither free option supports our powerful API / Plugin ecosystem.
Binary Ninja Cloud is our free, online reverse engineering tool.
Sidekick Makes Reverse Engineering Easy Don't open that binary alone! Take Sidekick, your AI-powered assistant, with you. Sidekick can help answer your questions about the binary, recover structures, name things, describe and comment code, find points of interest, and much more.
4.0: Dorsai
3.5: Expanded Universe
3.4: Finally Freed
3.3: The Bytes Must Flow
3.2 Release
3.1 The Performance Release
3.0 The Next Chapter
Hijacking the Binary Ninja UI for Fun and Profit
User Guide
Migrating from Other Tools
Getting Started
Migrating from IDA
Migrating from Ghidra
There's so many things to learn about working with Types in Binary Ninja that we've organized it into several sections!
Basic Type Editing: Brief overview of the basics
Basic Type Editing The biggest culprit of bad decompilation is often missing type information. Therefore, some of the most important actions you can take while reverse engineering are renaming symbols/variables, applying types, and creating new types to apply.
Working with Types: Interacting with types in disassembly and decompilation
Working with Types, Structures, and Symbols in Decompilation There are two main ways to interact with types in decompilation or disassembly. The first is to use the types view, and the second is to take advantage of the smart structures workflow or otherwise annotate types directly in a disassembly or IL view.
Importing/Exporting Types: How to import or export types from header files, archives, or other BNDBs
Importing Type Information Type information can be imported from a variety of sources. If you have header files, you can import a header. If your types exist in an existing BNDB, you can use import from a bndb. With the introduction of type archives we recommend migrating away from importing via BNDB to type archives as they allow types to remain synced between different databases.
Import BNDB File The Import BNDB File feature imports types from a previous BNDB into your currently open file. In addition, it will apply types for matching symbols in functions and variables. Import BNDB will not port symbols from a BNDB with symbols to one without -- the names must already match. Matching functions and porting symbols is beyond the scope of this feature.
Import Header File If you already have a collection of headers containing types you want to use, you can import them directly. You can specify the compiler flags that would be used if a compiler were compiling a source file that uses this header.
After specifying the file(s) and flag(s), pressing Preview will give a list of all the types and functions defined in the file(s). You may check or uncheck the box next to any of the types/functions to control whether they will be imported to your analysis.
Finding System Headers Since you need to specify the include paths for system headers, you will need to deduce them for the target platform of your analysis. Here are a few tricks that may help
Systems with GCC/Clang (macOS, Linux, etc) On these systems, you can run a command to print the default search path for compilation:
gcc -Wp,-v -E -
clang -Wp,-v -E -
For the directories printed by this command, you should include them with -isystem<path> in the order specified.
⇒ gcc -Wp,-v -E -
clang -cc1 version 15.0.0 (clang-1500.3.9.4) default target x86_64-apple-darwin23.4.0
ignoring nonexistent directory "/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/local/include"
ignoring nonexistent directory "/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/Library/Frameworks"
#include "..." search starts here:
#include <...> search starts here:
/usr/local/include
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/15.0.0/include
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/System/Library/Frameworks (framework directory)
End of search list.
-isystem/usr/local/include
-isystem/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/15.0.0/include
-isystem/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include
-isystem/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include
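Turning the verbose compiler output into `-isystem` flags by hand is error-prone, so here is a small sketch that does the conversion. It assumes the text layout shown above (a `#include <...> search starts here:` marker followed by one directory per line); `isystem_flags` is an illustrative name and not part of Binary Ninja.

```python
def isystem_flags(verbose_output):
    """Convert `gcc -Wp,-v -E -` output into -isystem flags, keeping order."""
    flags = []
    in_search_list = False
    for line in verbose_output.splitlines():
        if line.startswith("#include <...> search starts here:"):
            in_search_list = True
        elif line.startswith("End of search list."):
            break
        elif in_search_list:
            path = line.strip()
            # Framework directories are not passed with -isystem
            if path and not path.endswith("(framework directory)"):
                flags.append("-isystem" + path)
    return flags
```

Capture the compiler output with `gcc -Wp,-v -E - < /dev/null 2> search.txt`, then pass the file contents to the function and paste the result into the compiler flags field.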
Additionally, several types of containers for type information are documented here:
Debug Info: Debug Info can provide additional type information (examples include DWARF and PDB files)
Debug Info Debug Info is a mechanism for importing types, function signatures, and data variables from either the original binary (eg. an ELF compiled with DWARF) or a supplemental file (eg. a PDB).
Currently debug info plugins are limited to types, function signatures, and data variables, but in the future will include line number information, comments, local variables, and possibly more.
Type Libraries: Type Libraries contain types from commonly-used dynamic libraries
Type Libraries Type Libraries are collections of type information (structs, enums, function types, etc.), corresponding to specific dynamic libraries that are imported into your analysis. You can browse and import them in the Types View.
Most of your usage of Type Libraries will be performed automatically by Binary Ninja when you analyze a binary. They are automatically imported based on the libraries that your binary uses. Any library functions or global variables your binary references will have their type signature imported, and any structures those functions and variables reference are imported as well.
Platform Types: Types that automatically apply to a platform
Platform Types Binary Ninja pulls type information from a variety of sources. The highest-level source are the platform types loaded for the given platform (which includes operating system and architecture). There are two sources of platform types. The first are shipped with the product in a binary path. The second location is in your user folder and is intended for you to put custom platform types.
Platform types are used to define types that should be available to all programs available on that particular platform. They are only for global common types.
Type Archives: How you can use type archives to share types between analysis databases
Type Archives Type Archives are files you can use to share types between analysis databases. You can create them and manage their contents in Types View.
Signature Libraries: Signature libraries are used to match names of functions with signatures for code that is statically compiled
Signature Library While many signatures are built-in and require no interaction to automatically match functions, you may wish to add or modify your own. First, install the SigKit plugin from the plugin manager.
Once the signature matcher runs, it will print a brief report to the console detailing how many functions it matched and will rename matched functions.
To generate a signature library for the currently-open binary, use Tools > Signature Library > Generate Signature Library. This will generate signatures for all functions in the binary that have a name attached to them. Note that functions with automatically-chosen names such as sub_401000 will be skipped. Once it's generated, you'll be prompted where to save the resulting signature library.
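To preview which functions would contribute signatures, the skip rule for automatically-chosen names can be imitated on a plain list of names. This is only an illustration of the rule as described above: it does not call the Binary Ninja API, and the `sub_<hex address>` pattern is an assumption based on the `sub_401000` example.

```python
import re

# Assumed shape of an automatically-chosen name: "sub_" followed by a hex address
AUTO_NAME = re.compile(r"^sub_[0-9a-fA-F]+$")


def signature_candidates(function_names):
    """Drop auto-named functions, keeping the ones that carry a real name."""
    return [name for name in function_names if not AUTO_NAME.match(name)]
```

Renaming the functions you care about before generating the library is what makes them eligible.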
Platform Types: Platform types are base types that apply to all binaries on a particular platform
Additionally, make sure to see the applying annotations section of the developer guide for information about using the API with types and covering the creation of many of the items described below.
Working with C++ Types and Virtual Function Tables
Virtual Function Tables Virtual functions are implemented by compilers using a virtual function table structure that is pointed to by instances of the class. The layout of these structures is compiler dependent, so when reverse engineering a C++ binary these structures must be created to match the in-memory layout used by the binary.
One of the most common tasks when reversing a C++ binary is discovering which functions a virtual function call can resolve to. Binary Ninja provides a "propagate data variable references" option in the Create Structure dialog to help with this. When this is enabled, pointers found in the data section that are part of any instance of the structure will be considered as cross references of the structure field itself. This allows you to click on the name of a virtual function and see which functions it can potentially call in the cross references view.
Devirtualizing C++ with Binary Ninja
Binary Ninja IL Example: Navigating to a Virtual Function Based on an Indirect Call
Objective-C (Beta) Recent versions of Binary Ninja ship with an additional plugin for assisting with Objective-C analysis. It provides both a workflow and a plugin command for enhancing Objective-C binary analysis.
Binary Ninja plugin & workflow to help analyze Objective-C code
Debugger Binary Ninja Debugger is a plugin that can debug executables on Windows, Linux, and macOS, and more!
The debugger plugin is shipped with Binary Ninja. It is open-source under an Apache License 2.0. Bug reports and pull requests are welcome!
Binary Ninja debugger
Binary Ninja Intermediate Language: Overview
Reading IL All of the various ILs (with the exception of the SSA forms) are intended to be easily human-readable and look much like pseudo-code. There is some shorthand notation that is used throughout the ILs, though, explained below
User Informed Data Flow Binary Ninja now implements User-Informed DataFlow (UIDF) to improve the static reverse engineering experience of our users. This feature allows users to set the value of a variable and have the internal dataflow engine propagate it through the control-flow graph of the function. Besides constant values, Binary Ninja supports various PossibleValueSet states as containers to help inform complex variable values.
Binary Ninja Workflows Documentation
Binary Ninja Workflows is an analysis orchestration framework which simplifies the definition and execution of a computational binary analysis pipeline. The extensible pipeline accelerates program analysis and reverse engineering of binary blobs at various levels of abstraction. Workflows supports hybridized execution models, where the ordering of activities in the pipeline can be well-known and procedural, or dynamic and reactive. Currently, the core Binary Ninja analysis is made available as a procedural model and is the aggregate of both module and function-level analyses.
Writing Plugins
UI Plugins
Hijacking the Binary Ninja UI for Fun and Profit
Taking Action With the Command Palette
Add support for popups (not message boxes) in the API
Making a new window or making the tooltips arbitrarily configurable is actually already possible, but only by using the Qt API.
Here's an example of tooltip code you can use
Some extra context for the code above: It doesn't use an "actual" Qt Tooltip because tooltips can't be an 'active' window (see documentation). This means they can't respond to things like key-press events, even if you try to catch the event yourself. As a result, the code above fakes one with `Qt.FramelessWindowHint` and `Qt.WindowStaysOnTopHint`.
Using the Binary Ninja API
Binary Ninja Python API Reference
Binary Ninja C++ API
`UIAction.registerAction`: https://api.binary.ninja/cpp/group__action.html#af92fcf662c19e708e006bfc91756183a
Public API, examples, documentation and issues for Binary Ninja
Cookbook
Ninjas In Training List of resources for folks beginning their journey into reverse engineering. If appropriate, resources are labelled as B for Beginner, I for Intermediate, and A for Advanced. Feel free to join the slack and hop in the `#ninjas-in-training` channel for specific questions.
[docs] Improve 'Finding System Headers' examples for C++
Support parsing C++ templates (at least concrete specializations)
Templatized types (in some form) in BN's type system
Demangled type names reference missing types
macOS Type Libraries
iOS Type Libraries
Show vtable function calls as normal function call
With the addition of `__data_var_refs` (see: C++ Types user docs) this is effectively solved. The next step is to automate the parsing and symbolizing of both MSVC (#3930) and Itanium RTTI (#3857).
MSVC RTTI analysis
GCC/Clang RTTI analysis
Incorrect `typeinfo_name_for` definitions in mach-o binaries
Official Binary Ninja Plugins
Binary Ninja Community Plugins
Binary Ninja Community Themes
A software reverse engineering (SRE) suite of tools developed by NSA's Research Directorate in support of the Cybersecurity mission
This (completely!) free version of IDA offers a privileged opportunity to see IDA in action. This light but powerful tool can quickly analyze the binary code samples and users can save and look closer at the analysis results.
IDA Home was introduced thanks to the experience Hex-Rays has been gaining throughout the years to propose hobbyists a solution that combines rapidity, reliability with the levels of quality and responsiveness of support that any professional reverse engineers should expect.
IDA Pro as a disassembler is capable of creating maps of their execution to show the binary instructions that are actually executed by the processor in a symbolic representation (assembly language). Advanced techniques have been implemented into IDA Pro so that it can generate assembly language source code from machine-executable code and make this complex code more human-readable.
The debugging feature augmented IDA with the dynamic analysis. It supports multiple debugging targets and can handle remote applications. Its cross-platform debugging capability enables instant debugging, easy connection to both local and remote processes and support for 64-bit systems and new connection possibilities.
This writeup is now deprecated. Please see this resource instead.
Debugging Mac OSX Applications with IDA Pro
r2papi Typescript APIs for radare2
The r2papi module implements a set of idiomatic and high-level APIs that are based on top of the minimalistic r2pipe API.
Interface R2Pipe Generic interface to interact with radare2, abstracts the access to the associated instance of the tool, which could be native via rlang or remote via pipes or tcp/http.
Class R2Shell Class that interacts with the r2ai plugin (requires rlang-python and r2i r2pm packages to be installed). Provides a way to script the interactions with different language models using javascript from inside radare2.
Radare2: Libre Reversing Framework for Unix Geeks
UNIX-like reverse engineering framework and command-line toolset
`r2` is a complete rewrite of radare. It provides a set of libraries, tools and plugins to ease reverse engineering tasks. Distributed mostly under LGPLv3, each plugin can have different licenses (see `r2 -L`, `rasm2 -L`, ...). The radare project started as a simple command-line hexadecimal editor focused on forensics. Today, `r2` is a featureful low-level command-line tool with support for scripting with the embedded Javascript interpreter or via `r2pipe`.
`r2` can edit files on local hard drives, view kernel memory, and debug programs locally or via a remote gdb server. `r2`'s wide architecture support allows you to analyze, emulate, debug, modify, and disassemble any binary.
Using the `r2pm` tool you can browse and install many plugins and tools that use radare2.
- esilsolve: The symbolic execution plugin, based on esil and z3
- iaito: The official Qt graphical interface
- keystone Assembler instructions using the Keystone library
- r2ai Run a Language Model in localhost with Llama inside r2!
- r2dec: A decompiler based on r2 written in JS, accessed with the `pdd` command
- r2diaphora: Diaphora's diffing engine working on top of radare2
- r2frida: The frida io plugin. Start r2 with `r2 frida://0` to use it
- r2ghidra: The native ghidra decompiler plugin, accessed with the `pdg` command
- r2papi High level api on top of r2pipe
- r2pipe Script radare2 from any programming language
- r2poke Integration with GNU/Poke for extended binary parsing capabilities
- r2yara Run Yara from r2 or use r2 primitives from Yara
- radius2: A fast symbolic execution engine based on boolector and esil
Official QT frontend of radare2
Access radare2 via pipe from any programming language!
The `r2pipe` APIs are based on a single `r2` primitive found behind `r_core_cmd_str()` which is a function that accepts a string parameter describing the `r2` command to run and returns a string with the result.
The decision behind this design comes from a series of benchmarks with different libffi implementations, which showed that using the native API is more complex and slower than just using raw command strings and parsing the output.
Because the output can be tricky to parse, it's recommended to use the JSON output and deserialize it into native language objects, which is much handier than handling and maintaining internal data structures and pointers.
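A minimal sketch of the "command string in, JSON string out" pattern described above. It assumes nothing about r2pipe beyond a callable that takes a command string and returns a string; with radare2 and the `r2pipe` Python package installed, that callable would be the `cmd` method of an opened session. The `fake_r2` function and its canned JSON are hypothetical stand-ins so the sketch runs without radare2.

```python
import json
from typing import Any, Callable


def run_json(run: Callable[[str], str], command: str) -> Any:
    """Run an r2 command that emits JSON and deserialize the result."""
    raw = run(command)
    return json.loads(raw) if raw.strip() else None


def fake_r2(command: str) -> str:
    # Hypothetical stand-in for a real session's cmd() method.
    # The JSON below is hand-written sample data, not captured r2 output.
    canned = {"ij": '{"bin": {"arch": "arm", "bits": 64}}'}
    return canned.get(command, "")


info = run_json(fake_r2, "ij")
print(info["bin"]["arch"], info["bin"]["bits"])
```

The r2pipe bindings already provide a `cmdj` helper that does this for you; the sketch only illustrates why working with native objects is easier than parsing text output.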
Radare2 official book
Radare2 wiki This is an ongoing work in progress and reflects various material obtained while studying how to use radare2.
radare2 – Advanced command-line hexadecimal editor, disassembler and debugger
`rabin2` – Binary program info extractor
This program allows you to get information about ELF/PE/MZ and CLASS files in a simple way.
`rafind2` – advanced command-line byte pattern search in files
`rafind2` is a program to find byte patterns in files
`radiff2` – unified binary diffing utility
`radiff2` implements many binary diffing algorithms for data and code.
After installing `xdot`, you can graph the difference between two binaries. The syntax is `radiff2 -g function_name binary1 binary2 | xdot -`
Yellow indicates that some offsets don't match, grey is a perfect match, and red shows a strong difference.
`rarun2` – `radare2` utility to run programs in exotic environments
This program is used as a launcher for running programs with different environment, arguments, permissions, directories and overridden default file descriptors.
You can preload `r2` inside a process. This is similar to `r2frida` but with a native implementation. Example: `rarun2 r2preload=yes program=/bin/cat` followed by the kill command that `rarun2` generates
`rasm2` – radare2 assembler and disassembler tool
This tool uses `r_asm` to assemble and disassemble files or hexpair strings. It supports a large list of architectures which can be listed using the `-L` flag.
Valid architecture and cpu: `rasm2 -L` (list of valid architectures and bits)
`rahash2` – block based hashing utility
Rahash2 allows you to calculate, check and show the hash values of each block of a target file. The block size is 32768 bytes by default. It's allowed to hash from stdin using '-' as a target file. You can compare against a known hash and get the result in the exit status.
You can hash big files by hashing each block and later determine what part of it has been modified. Useful for filesystem analysis.
This command can be used to calculate hashes of a certain part of a file or a command line passed string.
WARNING - Do not try to use rahash2 on a big file as it attempts to load the entire file in memory first.
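The block-based idea can be sketched in a few lines of standard-library Python. This is not how `rahash2` is implemented; it is an illustration of hashing fixed-size blocks (using the 32768-byte default mentioned above) and comparing two versions to find which block changed. The function names are made up for this example. Reading one block at a time also avoids loading the whole file into memory.

```python
import hashlib
import io

BLOCK_SIZE = 32768  # default block size quoted in the text above


def block_hashes(stream, block_size=BLOCK_SIZE, algo="sha256"):
    """Yield (offset, hex digest) for each block read from a binary stream."""
    offset = 0
    while True:
        block = stream.read(block_size)
        if not block:
            break
        yield offset, hashlib.new(algo, block).hexdigest()
        offset += len(block)


def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Return the offsets of blocks whose hashes differ between two streams."""
    a = dict(block_hashes(old, block_size))
    b = dict(block_hashes(new, block_size))
    return sorted(off for off in a.keys() | b.keys() if a.get(off) != b.get(off))


original = bytes(100_000)
patched = bytearray(original)
patched[40_000] = 0xFF  # this byte lives in the block starting at 32768

result = changed_blocks(io.BytesIO(original), io.BytesIO(bytes(patched)))
print(result)
```

For real files, pass objects opened with `open(path, "rb")` instead of the in-memory streams used here.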
`rax2` – radare base converter
This command allows you to convert values between positive and negative integer, float, octal, binary and hexadecimal values.
`r2pm` (radare2 package manager)
Allows you to install, update, uninstall and discover plugins and tools that can be used with radare2.
To debug a MachO file using `r2`, run `r2` with `sudo`. Otherwise you need to sign r2.
A book on learning radare2
The goal of this book is to accommodate the reader with radare2, which is quickly becoming a bread & butter tool for any reverse engineer, malware analyst or biweekly CTF player. It is not meant to replace the Radare2 Book, but rather to complement it.
Radare2 Wiki The goal of this wiki is to make a searchable collection of documents which can be used to find various use cases and help regarding using r2.
r2ghidra This is an integration of the Ghidra decompiler for radare2. It is solely based on the decompiler part of Ghidra, which is written entirely in C++, so Ghidra itself is not required at all and the plugin can be built self-contained.
local language model for radare2
Radare2 Script Skeletons This repository contains directories that can be used as a skeleton or template to start writing your projects that use radare2.
Radare2 can be extended in many ways:
- Scripts
- Plugins
- Programs
And it is possible to use almost any language to do so. Those templates include the .vscode and build files too, so you can quickly start doing real work!
Dynamic instrumentation toolkit for developers, reverse-engineers, and security researchers.
Scriptable Inject your own scripts into black box processes. Hook any function, spy on crypto APIs or trace private application code, no source code needed. Edit, hit save, and instantly see the results. All without compilation steps or program restarts.
Portable Works on Windows, macOS, GNU/Linux, iOS, watchOS, tvOS, Android, FreeBSD, and QNX. Install the Node.js bindings from npm, grab a Python package from PyPI, or use Frida through its Swift bindings, .NET bindings, Qt/Qml bindings, Go bindings, or C API. We also have a scalable footprint.
Free Frida is and will always be free software (free as in freedom). We want to empower the next generation of developer tools, and help other free software developers achieve interoperability through reverse engineering.
Battle-tested We are proud that NowSecure is using Frida to do fast, deep analysis of mobile apps at scale. Frida has a comprehensive test-suite and has gone through years of rigorous testing across a broad range of use-cases.
Dynamic instrumentation toolkit for developers, reverse-engineers, and security researchers.
Clone this repo to build Frida
Frida core library intended for static linking into bindings
Cross-platform instrumentation and introspection library written in C
This library is consumed by frida-core through its JavaScript bindings, GumJS.
Frida Node.js bindings
Frida Python bindings
Frida Swift bindings
Frida Rust bindings
medusa Binary instrumentation framework based on FRIDA
MEDUSA is an extensible and modularized framework that automates processes and techniques practiced during the dynamic analysis of Android and iOS Applications.
Collection of useful FRIDA Mobile Scripts
Observer Security Bypass Static Analysis Specific Software Other
Category: Data structures
Category: Debugging & Memory
Virtual Function Tables Virtual functions are implemented by compilers using a virtual function table structure that is pointed to by instances of the class. The layout of these structures is compiler dependent, so when reverse engineering a C++ binary these structures must be created to match the in-memory layout used by the binary.
C++ vtables - Part 1 - Basics
In this mini post-series we’ll explore how clang implements vtables & RTTI. In this part we’ll start with some basic classes and later on cover multiple inheritance and virtual inheritance.
`NonVirtualClass` has a size of 1 because in C++ classes can’t have zero size. However, this is not important right now.
`VirtualClass`’s size is `8` on a 64 bit machine. Why? Because there’s a hidden pointer inside it pointing to a `vtable`. `vtable`s are static translation tables, created for each virtual-class.
Here’s what we learned from the above:
- Even though the classes have no data members, there’s a hidden pointer to a `vtable`;
- `vtable` for `p1` and `p2` is the same. `vtable`s are static data per-type;
- `d1` and `d2` inherit a `vtable`-pointer from `Parent` which points to `Derived`’s vtable;
- All `vtable`s point to an offset of `16` (`0x10`) bytes into the vtable. We’ll also discuss this later.
Note: we’re looking at demangled symbols. If you really want to know, `_ZTV` is a prefix for `vtable`, `_ZTS` is a prefix for type-string (name) and `_ZTI` is for type-info.
Here's `Parent`'s vtable layout:

| Address | Value | Meaning |
|---|---|---|
| 0x400ba8 | 0x0 | top_offset (more on this later) |
| 0x400bb0 | 0x400b78 | Pointer to typeinfo for Parent (also part of the above memory dump) |
| 0x400bb8 | 0x400aa0 | Pointer to Parent::Foo(). `Parent`'s _vptr points here. |
| 0x400bc0 | 0x400a90 | Pointer to Parent::FooNotOverridden() |

Here's `Derived`'s vtable layout:

| Address | Value | Meaning |
|---|---|---|
| 0x400b40 | 0x0 | top_offset (more on this later) |
| 0x400b48 | 0x400b90 | Pointer to typeinfo for Derived (also part of the above memory dump) |
| 0x400b50 | 0x400a80 | Pointer to Derived::Foo(). `Derived`'s _vptr points here. |
| 0x400b58 | 0x400a90 | Pointer to Parent::FooNotOverridden() (same as `Parent`'s) |
Remember that the `vtable` pointer in Derived pointed to a `+16` bytes offset into the `vtable`? The 3rd pointer is the address of the first method pointer. Want the 3rd method? No problem - add `2 * sizeof(void*)` to `vtable`-pointer. Want the `typeinfo` record? Jump to the pointer before.
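The pointer arithmetic above can be checked against the `Parent` vtable values quoted earlier (base address 0x400ba8 and the four 8-byte entries). The sketch below rebuilds that small memory region as bytes and reads slots relative to the `_vptr`; the helper names are made up for illustration, and a 64-bit little-endian layout is assumed.

```python
import struct

PTR = 8  # sizeof(void*) on the 64-bit build the addresses above come from

# Parent's vtable exactly as listed above, starting at 0x400ba8.
VTABLE_BASE = 0x400BA8
vtable_bytes = struct.pack("<4Q", 0x0, 0x400B78, 0x400AA0, 0x400A90)


def read_ptr(addr: int) -> int:
    """Read one little-endian pointer from the reconstructed vtable memory."""
    return struct.unpack_from("<Q", vtable_bytes, addr - VTABLE_BASE)[0]


def method_slot(vptr: int, index: int) -> int:
    """Address of the index-th (0-based) virtual method pointer."""
    return vptr + index * PTR


# The object's _vptr skips top_offset and the typeinfo pointer (+16 bytes).
vptr = VTABLE_BASE + 2 * PTR

print(hex(vptr))                             # where Parent's _vptr points
print(hex(read_ptr(method_slot(vptr, 0))))   # Parent::Foo()
print(hex(read_ptr(method_slot(vptr, 1))))   # Parent::FooNotOverridden()
print(hex(read_ptr(vptr - PTR)))             # typeinfo for Parent
```

The same arithmetic applies when walking a vtable in a disassembler: an indirect call through `[vptr + 8]` resolves to the second virtual method.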
C++ vtables - Part 2 - Multiple Inheritance
The world of single-parent inheritance hierarchies is simpler for the compiler. As we saw in Part 1, each child class extends its parent vtable by appending entries for each new virtual method. In this post we will cover multiple inheritance, which complicates things even when only inheriting from pure-interfaces.
C++ vtables - Part 3 - Virtual Inheritance
C++ vtables - Part 4 - Compiler-Generated Code
The Secret Life of C++: What Your Compiler Doesn't Want You To Know
C++ is filled with strange and wonderful features, with a few more added in C++11. We will explore in detail how these features are implemented under the covers, in terms of the assembly code generated. Features to be explored include construction and destruction, copying, references, virtual methods, method dispatch, object layout, exceptions, templates, anonymous functions, captures, and more.
Hour One: The Not So Secret Life of C, and crash course in x86_64 assembly
Hour Two
References
Symbol Mangling
Objects, Methods, Inheritance, Copying, References, Methods
Runtime Time Type Information and Casting
Hour Three
Virtual Inheritance Review
Initializing Global Objects
Exceptions
Hour Four
Syntactic Sugar
Templates
Anonymous Functions, Captures
Static Analysis of C++ Virtual Tables (from GCC) James Rowley, Marcus Engineering, LLC Hardwear.io USA 2023
- Structures and C++
Virtual function arrays The implementation of virtual function tables depends on the base classes of a given class. It is also often compiler specific. A good demo can be found at:
VTable Notes on Multiple Inheritance in GCC C++ Compiler v4.0.1
Chapter 12. C++ Class Representations
Exploring `std::string`
Every C++ developer knows that `std::string` represents a sequence of characters in memory. It manages its own memory, and is very intuitive to use. Today we’ll explore `std::string` as defined by the C++ Standard, and also by looking at 4 major implementations.
One particular optimization found its way to pretty much all implementations: small objects optimization (aka small buffer optimization). Simply put, Small Object Optimization means that the `std::string` object has a small buffer for small strings, which saves dynamic allocations.
Recent GCC versions use a union of buffer (16 bytes) and capacity (8 bytes) to store small strings. Since `reserve()` is mandatory (more on this later), the internal pointer to the beginning of the string either points to this union or to the dynamically allocated string.
`clang` is by-far the smartest and coolest. While `std::string` has the size of 24 bytes, it allows strings up to 22 bytes(!!) with no allocation. To achieve this libc++ uses a neat trick: the size of the string is not saved as-is but rather in a special way: if the string is short (< 23 bytes) then it stores `size() * 2`. This way the least significant bit is always 0. The long form always bitwise-ors the LSB with 1, which in theory might have meant unnecessarily larger allocations, but this implementation always rounds allocations to be of form `16*n - 1` (where `n` is an integer). By the way, the allocated string is actually of form `16*n`, the last character being `'\0'`.
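The encoding trick is easier to see with numbers. The sketch below models only what the paragraph above describes (short sizes stored doubled, the low bit used as a "long" flag, long allocations rounded to multiples of 16); it is not taken from the libc++ sources, the function names are invented, and newer libc++ releases may lay things out differently.

```python
SHORT_MAX = 22  # longest string stored inline, per the text above


def short_size_field(size: int) -> int:
    """Short form: store size * 2, so the least significant bit is 0."""
    if not 0 <= size <= SHORT_MAX:
        raise ValueError("too long for the short form")
    return size * 2


def long_allocation(size: int) -> int:
    """Bytes allocated for a long string: a multiple of 16 that also fits the terminator."""
    return (size + 1 + 15) // 16 * 16


def long_cap_field(size: int) -> int:
    """Long form: the allocation size with the LSB set as the 'long' flag."""
    return long_allocation(size) | 1


def is_long(field: int) -> bool:
    """The low bit distinguishes the long form from the short form."""
    return bool(field & 1)


field = short_size_field(5)
print(field, is_long(field), field >> 1)

alloc = long_allocation(30)
print(alloc, alloc - 1)  # allocation is 16*n, usable capacity is 16*n - 1

cap = long_cap_field(30)
print(is_long(cap), cap & ~1)  # clearing the flag recovers the allocation
```

Because every allocation is a multiple of 16, its low bit is always free, which is why setting the flag does not cost any extra space.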
Memory Layout of `std::string`
Discover how `std::string` is represented in the most popular C++ Standard Libraries, such as MSVC STL, GCC libstdc++, and LLVM libc++.
In this post of Tasty C++ series we’ll look inside of `std::string`, so that you can more effectively work with C++ strings and take advantage and avoid pitfalls of the C++ Standard Library you are using.
In C++ Standard Library, std::string is one of the three contiguous containers (together with `std::array` and `std::vector`). This means that a sequence of characters is stored in a contiguous area of the memory and an individual character can be efficiently accessed by its index at `O(1)` time. The C++ Standard imposes more requirements on the complexity of string operations, which we will briefly focus on later in this post.
If we are talking about the C++ Standard, it’s important to remember that it doesn’t impose exact implementation of `std::string`, nor does it specify the exact size of `std::string`. In practice, as we’ll see, the most popular implementations of the C++ Standard Library allocate 24 or 32 bytes for the same std::string object (excluding the data buffer). On top of that, the memory layout of string objects is also different, which is a result of a tradeoff between optimal memory and CPU utilization, as we’ll also see below.
For people just starting to work with strings in C++, `std::string` is usually associated with three data fields:
- Buffer – the buffer where string characters are stored, allocated on the heap.
- Size – the current number of characters in the string.
- Capacity – the max number of characters the buffer can fit, a size of the buffer.
Talking C++ language, this picture could be expressed as the following class:
class TastyString {
    char *  m_buffer;    // string characters
    size_t  m_size;      // number of characters
    size_t  m_capacity;  // m_buffer size
};
This representation takes 24 bytes and is very close to the production code.
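The 24-byte figure is simply three pointer-sized fields on a 64-bit target. As a quick sanity check, the native size of that layout can be computed with Python's `struct` module, where `P` stands for `void *` and `N` for `size_t`. The field names are the ones from the class above; the result depends on the platform the interpreter runs on.

```python
import struct

# Native ("@") layout of the three TastyString fields.
FIELDS = {
    "m_buffer": "P",    # char *  -> pointer-sized
    "m_size": "N",      # size_t
    "m_capacity": "N",  # size_t
}

layout = "@" + "".join(FIELDS.values())
total = struct.calcsize(layout)
pointer_size = struct.calcsize("@P")

# On a typical 64-bit platform this prints 24 and 8.
print(total, pointer_size)
```

On a 32-bit build the same three fields would occupy 12 bytes, which is one reason object sizes seen in a disassembler differ between targets.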
`std::string` implementation in GCC and its memory overhead for short strings
At least with GCC 4.4.5, which is what I have handy on this machine, `std::string` is a `typedef` for `std::basic_string<char>`, and `basic_string` is defined in `/usr/include/c++/4.4.5/bits/basic_string.h`. There's a lot of indirection in that file, but what it comes down to is that nonempty `std::string`s store a pointer to one of these:
struct _Rep_base {
    size_type     _M_length;
    size_type     _M_capacity;
    _Atomic_word  _M_refcount;
};
Followed in-memory by the actual string data. So `std::string` is going to have at least three words of overhead for each string, plus any overhead for having a higher capacity than `length` (probably not, depending on how you construct your strings -- you can check by asking the `capacity()` method).
There's also going to be overhead from your memory allocator for doing lots of small allocations; I don't know what GCC uses for C++, but assuming it's similar to the `dlmalloc` allocator it uses for C, that could be at least two words per allocation, plus some space to align the size to a multiple of at least 8 bytes.
Layout of `std::vector` (`libstdc++`)
Building a Universal macOS Binary
Create macOS apps and other executables that run natively on both Apple silicon and Intel-based Mac computers.
To create a universal binary for your project, merge the resulting executable files into a single executable binary using the lipo tool.
lipo -create -output universal_app x86_app arm_app
Determine Whether Your Binary Is Universal
To users, a universal binary looks no different than a binary built for a single architecture. When you build a universal binary, Xcode compiles your source files twice—once for each architecture. After linking the binaries for each architecture, Xcode then merges the architecture-specific binaries into a single executable file using the `lipo` tool. If you build the source files yourself, you must call `lipo` as part of your build scripts to merge your architecture-specific binaries into a single universal binary.
To see the architectures present in a built executable file, run the `lipo` or `file` command-line tools. When running either tool, specify the path to the actual executable file, not to any intermediate directories such as the app bundle. For example, the executable file of a macOS app is in the `Contents/MacOS/` directory of its bundle. When running the `lipo` tool, include the `-archs` parameter to see the architectures.
% lipo -archs /System/Applications/Mail.app/Contents/MacOS/Mail
x86_64 arm64
To obtain more information about each architecture, pass the `-detailed_info` argument to `lipo`.
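When `lipo` is not available (for example, when triaging a binary on Linux), the architecture list can be read straight from the fat header. The sketch below is not a replacement for `lipo`: it handles only the 32-bit fat header (magic `0xcafebabe`, big-endian), ignores CPU subtypes such as arm64e, and the CPU type constants are the values I believe `mach/machine.h` defines. The function name is made up for this example.

```python
import struct

FAT_MAGIC = 0xCAFEBABE  # 32-bit fat header; the 64-bit variant is not handled here

# CPU type constants assumed from mach/machine.h.
CPU_NAMES = {
    0x00000007: "i386",
    0x01000007: "x86_64",
    0x0000000C: "arm",
    0x0100000C: "arm64",
}


def fat_archs(data: bytes) -> list:
    """Return the architecture names listed in a universal (fat) Mach-O header."""
    magic, count = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a 32-bit fat Mach-O header")
    archs = []
    for i in range(count):
        # struct fat_arch: cputype, cpusubtype, offset, size, align (20 bytes)
        cputype, _sub, _off, _size, _align = struct.unpack_from(">iiIII", data, 8 + i * 20)
        archs.append(CPU_NAMES.get(cputype, hex(cputype)))
    return archs


# Synthetic two-slice header built for the demo (not a real binary).
sample = (
    struct.pack(">II", FAT_MAGIC, 2)
    + struct.pack(">iiIII", 0x01000007, 3, 16384, 1000, 14)
    + struct.pack(">iiIII", 0x0100000C, 0, 32768, 1000, 14)
)
print(fat_archs(sample))
```

For a real file, read the first few hundred bytes with `open(path, "rb").read(4096)` and pass them in. Note that Java class files share the same magic number, so a match alone does not prove the file is a Mach-O binary.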
Specify the Launch Behavior of Your App For universal binaries, the system prefers to execute the slice that is native to the current platform. On an Intel-based Mac computer, the system always executes the x86_64 slice of the binary. On Apple silicon, the system prefers to execute the arm64 slice when one is present. Users can force the system to run the app under Rosetta translation by enabling the appropriate option from the app’s Get Info window in the Finder.
If you never want users to run your app under Rosetta translation, add the LSRequiresNativeExecution key to your app’s Info.plist file. When that key is present and set to YES, the system prevents your app from running under translation. In addition, the system removes the Rosetta translation option from your app’s Get Info window. Don’t include this key until you verify that your app runs correctly on both Apple silicon and Intel-based Mac computers.
If you want to prioritize one architecture, without preventing users from running your app under translation, add the LSArchitecturePriority key to your app’s Info.plist file. The value of this key is an ordered array of strings, which define the priority order for selecting an architecture.
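As an illustration of what these keys look like in a property list, the sketch below uses Python's `plistlib` to render two alternative fragments. The key names come from the text above; the architecture strings `arm64` and `x86_64` and the helper name are assumptions made for this example.

```python
import plistlib

# Option 1: never allow Rosetta translation.
native_only = {"LSRequiresNativeExecution": True}

# Option 2: prefer arm64, then x86_64, without blocking translation.
prefer_arm = {"LSArchitecturePriority": ["arm64", "x86_64"]}


def info_plist_fragment(keys: dict) -> str:
    """Render the given keys as XML property-list text."""
    return plistlib.dumps(keys).decode("utf-8")


print(info_plist_fragment(native_only))
print(info_plist_fragment(prefer_arm))
```

In a real app these keys would be merged into the existing `Info.plist` rather than written as a standalone file.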
`lipo`
Create or operate on a universal file: convert a universal binary to a single architecture file, or vice versa.
`lipo` produces one output file, and never alters the input file.
`lipo` can: list the architecture types in a universal file; create a single universal file from one or more input files; thin out a single universal file to one specified architecture type; and extract, replace, and/or remove architectures types from the input file to create a single new universal output file.
LIPO
This `lipo` is designed to be compatible with macOS `lipo`, which is a utility for creating Universal Binaries, also known as Fat Binaries.
If you do not have the source, then you can patch the DLL at the entry point with `0xcc` for a breakpoint or `0xEB 0xFE` (a short `jmp` to itself) for an endless loop. In the case of the breakpoint opcode, you will trigger the debugger on execution.
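A minimal sketch of the patching step, working on an in-memory copy of the file. It assumes you already know the file offset of the entry point (for example from a PE parser or your disassembler); finding that offset is out of scope here, and the helper name is made up. The original bytes are returned so they can be restored once the debugger is attached.

```python
INT3 = b"\xCC"          # breakpoint opcode
JMP_SELF = b"\xEB\xFE"  # short jump to itself (endless loop)


def patch(image: bytearray, offset: int, new_bytes: bytes) -> bytes:
    """Overwrite bytes at a file offset and return the bytes that were replaced."""
    end = offset + len(new_bytes)
    if offset < 0 or end > len(image):
        raise ValueError("patch does not fit inside the image")
    original = bytes(image[offset:end])
    image[offset:end] = new_bytes
    return original


# Stand-in for the first bytes at the entry point (push ebp; mov ebp, esp; nop).
image = bytearray(b"\x55\x8B\xEC\x90")

saved = patch(image, 0, JMP_SELF)
print(image.hex(), saved.hex())

# Once attached, restore the original bytes before resuming.
patch(image, 0, saved)
print(image.hex())
```

With the endless-loop variant the process spins at the entry point, which gives you time to attach; remember to write the saved bytes back in the debugger before continuing.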
I'm curious how the process looks when reverse engineering audio VST plugins. They're `*.dll` files and therefore not as straightforward as `*.exe` applications, which means that you can't attach to them directly via x64dbg, for example; you have to attach the debugger to the host or DAW, I guess.
I’ve reverse engineered a vst that was no longer able to be purchased and such was unusable. I loaded it in FL Studio in 32 bit mode and used x32dbg. Also used IDA to map out functions and such. Good luck
Just write a DLL loader and call the function you're interested in debugging.
I'd explicitly load the library with LoadLibrary, get the function pointer with GetProcAddress and then call the function to maximize flexibility and control.
Depending on what the debugger you use can actually do and whether it can work around ASLR, you can put the breakpoint at the main program (where it's not randomized) at the function call rather than in the dll which may be anywhere in the address space.
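The "load the library, resolve the export, call it" idea can be prototyped with Python's `ctypes` before writing a native loader. `ctypes.CDLL` wraps the platform loader and attribute lookup resolves the export, which corresponds to the `LoadLibrary` and `GetProcAddress` steps on Windows. The `resolve` helper is made up for this sketch, and the demo calls `abs` from the C runtime only because no real plugin is available here.

```python
import ctypes
import ctypes.util
import sys


def resolve(library_path, symbol, restype, argtypes):
    """Load a shared library and return a typed handle to one exported function."""
    lib = ctypes.CDLL(library_path)  # platform loader (LoadLibrary / dlopen)
    func = getattr(lib, symbol)      # export lookup (GetProcAddress / dlsym)
    func.restype = restype
    func.argtypes = argtypes
    return func


# Stand-in target: the C runtime. For a plugin you would pass the DLL path
# and the name and signature of the export you want to debug.
runtime = "msvcrt" if sys.platform == "win32" else ctypes.util.find_library("c")
c_abs = resolve(runtime, "abs", ctypes.c_int, [ctypes.c_int])
print(c_abs(-5))
```

Running the loader under a debugger and breaking just before the call gives you a stable place to stop, regardless of where ASLR places the DLL.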
Debugging VST Audio Plug In´s
For educational purposes I am trying to reverse engineer VST audio plug ins. Basically these are .dll files hosted by an audio application.
How can I debug them? I tried OllyDBG to load the .dll which works fine. When I do "Call DLL Export" and choose "VSTPluginMain" which is the entry point for every VST plugin, i get an access violation.
I also tried to load the vst host (Ableton Live) and trying to start the VST, but this source is not loaded in ollyDBG. I am a bit confused. Can you point me in the correct direction?
I need a technique to debug these plugin .dll´s.
Get the plugin SDK from Steinberg and study it. You'll notice that you need to pass address of your own audioMasterCallback function to VSTPluginMain().
Either make your own VST host, or find one which you can debug.
I have found the minihost in the Steinberg SDK. I used it to load a vst and now I can debug my host and inside ollyDbg I can now debug the vst.
VST 3 Developer Portal
Change History
VST 3 Plug-In SDK
Adding VST2 version
The VST 2 SDK is no longer part of the VST 3 SDK; you have to use an older version of the SDK and copy the `vst2sdk` folder into the `VST_SDK` folder. In order to build a VST2 version of the plug-in and a VST3 at the same time, you need to copy the VST2 folder into the VST3 folder.
Make the VST2 SDK available
Would it be possible to make the VST2 SDK available on GitHub as a separate repo? This would make certain users' lives a lot easier.
VST2 SDK is not anymore supported. https://www.steinberg.net/en/newsandevents/news/newsdetail/article/vst-2-coming-to-an-end-4727.html
there is already a reverse engineered VST2 header since 2006 called VeSTige. It allows you to build a to-spec VST2 plugin.
It doesn't have any of the SDK features of course, but that's not really the point here (the SDK is probably the worst part about all of VST*, that and being completely closed to any revisions).
You can find examples of projects using vestige all over the place https://github.com/x42/lv2vst/blob/master/include/vestige.h?rgh-link-date=2021-02-24T18%3A31%3A13Z
A plugin from Github that I am working with (the GLSL Plugin) says that "The VST2 SDK can be obtained from the `vstsdk3610_11_06_2018_build_37` (or older) VST3 SDK..." However, there seems to be no `VST_2` in `steinbergmedia`/`vst3dk`'s code, files, etc. Are they missing, or are they not supposed to be in there?
VST2 SDK is not anymore available.
`vstsdk367_03_03_2017_build_352`, which I believe already had the VST2SDK removed
You can download the last version of the VST3 SDK that includes the full VST2 SDK here.
Unpack the contents of the "VST3 SDK" subfolder from that archive into this folder. This means that folders like "plugininterface" and "public.sdk" should be located next to this file.
"Category : VST 2.x Interfaces" "Filename : pluginterfaces/vst2.x"
The VST2 SDK can be obtained from the `vstsdk3610_11_06_2018_build_37` (or older) VST3 SDK or JUCE version `5.3.2`
Removed the embedded VST2 SDK
VST3_SDK
VST3_SDK/pluginterfaces/vst2.x
Workaround
----------
1. The VST2 SDK can be obtained from the vstsdk3610_11_06_2018_build_37 (or
older) VST3 SDK or JUCE version 5.3.2. You should put the VST2 SDK in your
header search paths.
VST 3 Project Generator
Hello World VST 3 Hello World VST 3 example plug-in
This is a simple Hello World VST 3 FX plug-in to demonstrate how to use the VST 3 SDK as an external project.
VSTGUI A user interface toolkit mainly for audio plug-ins
VSTGUI is a user interface toolkit mainly for audio plug-ins (VST, AAX, AudioUnit, etc...) and it is designed for working well with VST 3 plug-ins and its wrappers like AU, AAX, ...
VST 3 Implementation Helper Classes And Examples
VST3 C API
The C API header of the VST3 API
This repository contains the VST3 C API.
It is automatically generated out of the C++ VST3 API (See https://github.com/steinbergmedia/vst3_c_api_generator)
VST 3 SDK Interfaces Here are located all VST 3 interfaces definitions (including VST Component/Controller, UI, Test).
A lightweight VST2/3 framework
Remove support for VST2
This project contains a "Hello World" style application for building a VST 2.4 plugin
removed vst2 files for licensing reasons
A Beginner's Guide to VST 2.4
Why VST 2.4? It is fair to wonder why (as of this writing in early 2018), I chose to create a project about VST 2.4. The reason is actually quite simple. When Steinberg released 3.0, they decided to make it not backward compatible. 2.4 is the last version released of the 2.x branch. Due to the immense popularity of the format, and the fact that (at least at the time), the 3.x branch was not bringing enough to justify porting all the plugins to the new version, the format has continued to thrive. Even despite Steinberg officially ending its support at the end of 2013, the format has continued to thrive.
For licensing reasons, you need to download the VST SDK from steinberg (3.6.8 as of 2018/01/01)
You can no longer download the VST 2.4 SDK and instead you have to download the VST 3 SDK, but it contains the 2.4 version. Also, although the VST3 SDK is open source (under a dual licensing including GPL3), version 2.4 is explicitly excluded so you need to get it yourself.
Anatomy of the SDK
The entire VST 2.4 SDK is comprised of 2 files, aeffect.h and aeffectx.h, located under VST2_SDK/pluginterfaces/vst2.x in the downloaded archive. There is no dependency on anything else, so in order to build and compile a basic VST 2.4 plugin, you only need to include aeffectx.h since it includes the other file. aeffectx.h contains extensions added after version 1.0.
Plugin Lifecycle
Based entirely on convention, the host will locate the plugin executable (packaging and location depend on the platform). The host will then look for a function with the following signature (by convention):
AEffect *VSTPluginMain(audioMasterCallback vstHostCallback)
or, if you prefer, with more readable/meaningful types:
typedef audioMasterCallback VSTHostCallback;
typedef AEffect VSTPlugin;
VSTPlugin *VSTPluginMain(VSTHostCallback vstHostCallback);
This function, written by the plugin developer, acts as a factory of plugins:
- it receives the host callback, which is used by the plugin to communicate with the host (for example, to get the sample rate with the opcode audioMasterGetSampleRate)
- it returns an instance of the AEffect (aka VSTPlugin) structure defined by the API
This structure contains information that the host requires, like the number of inputs and outputs, as well as 5 function pointers which define the callbacks that the host will use to interact with the plugin.
Example of a self contained VST3/VST2 plugin that is not part of the SDK
This project is exactly the again example that ships as part of the VST3 SDK, but self-contained and depending on the SDK (vs. being part of the SDK). As a result, it can be used as a starting point to build other plugins.
Note 2020/01 This project was created in 2018 using the VST3 SDK 3.6.9 which includes VST2. More recent versions of the SDK have removed VST2 support. Although this project is still valid as long as you use 3.6.9, you should check Jamba which offers a very easy way to bootstrap a blank self contained plugin which depends on the SDK. Jamba also offers a lot of additional features.
CMake project for VST SDK 2.4
VST audio plugin GUI minihost for Linux
This is the minihost example from Steinberg VSTSDK-2.4, with added support for building on Linux
Minihost Modular is a modular environment for hosting/interconnecting VST/AU plugins based on a custom modular engine especially developed for this purpose. As a standalone, Minihost Modular can be used as an advanced VST or AU host with modular routing and some sequencing recording/playback capabilities. As a VST or AU plugin, Minihost Modular can be used to extend the capabilities of your existing DAW software with its powerful, modular, recallable environment. Minihost Modular bears some similarities to FL Studio's Patcher but has extended capabilities as a self-contained host.
Symbiosis is a developer tool for adapting Mac OS X VST plug-ins to the Audio Unit (AU) standard.
JUCE is an open-source cross-platform C++ application framework for desktop and mobile applications, including VST, VST3, AU, AUv3, LV2 and AAX audio plug-ins.
Mod/Div Deoptimization
One of the many things compilers do that can make reverse engineering harder is apply a variety of algorithmic optimizations, in particular for modulus and division calculations. Instead of implementing them with the native CPU instructions, they will use shifts and multiplications with magic constants that, when operating on a fixed integer size, have the same effect as a native division instruction.
There are several ways to try to recover the original division, which is far more intuitive and easier to reason about.
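As a concrete illustration, here is a small Python sketch (not taken from any of the linked tools) of one well-known instance of this pattern: for an unsigned 32-bit x, x / 10 can be computed as a multiplication by 0xCCCCCCCD followed by a right shift of 35 bits. The helper names are hypothetical, and the divisor-recovery helper assumes the magic constant was chosen as roughly 2^shift / d, which holds for this case but is not guaranteed for every compiler-generated sequence.

```python
# Magic constant and shift for unsigned 32-bit division by 10.
# MAGIC is ceil(2**35 / 10).
MAGIC = 0xCCCCCCCD
SHIFT = 35


def div10_magic(x):
    """Divide an unsigned 32-bit value by 10 using only multiply and shift."""
    if not 0 <= x < 2**32:
        raise ValueError("x must fit in an unsigned 32-bit integer")
    return (x * MAGIC) >> SHIFT


def mod10_magic(x):
    """The matching modulus: x - 10 * (x / 10)."""
    return x - 10 * div10_magic(x)


def recover_divisor(magic, shift):
    """Guess the original divisor, assuming magic is about 2**shift / d.

    Uses rounded integer division so no floating point is involved.
    """
    return (2**shift + magic // 2) // magic
```

When reading a disassembly, spotting a multiply by a large constant followed by a right shift and plugging both values into something like `recover_divisor` is often enough to guess the original `x / d` or `x % d` expression, which can then be checked against a few sample inputs.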
Fast divisionless computation of binomial coefficients
We would prefer to avoid divisions entirely. If we assume that k is small, then we can just use the fact that we can always replace a division by a known value with a shift and a multiplication. All that is needed is that we precompute the shift and the multiplier. If there are few possible values of k, we can precompute it with little effort.
I provide a full portable implementation complete with some tests. Though I use C, it should work as-is in many other programming languages. It should only take tens of CPU cycles to run. It is going to be much faster than implementations relying on divisions.
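A minimal Python sketch of the idea follows. It assumes the exact-division variant of the trick: each intermediate product is an exact multiple of the divisor z, so dividing by z can be done by shifting out z's power-of-two factor and multiplying by the modular inverse of its odd part modulo 2^64. The names (`TABLE`, `binomial_divisionless`) are illustrative and are not taken from the original implementation; `pow(x, -1, m)` needs Python 3.8 or later.

```python
MASK64 = (1 << 64) - 1  # emulate 64-bit unsigned wrap-around


def precompute(max_k):
    """For each divisor z in 2..max_k, store (shift, inverse of odd part)."""
    table = {}
    for z in range(2, max_k + 1):
        shift = (z & -z).bit_length() - 1   # number of trailing zero bits
        odd = z >> shift                    # odd part of z
        inverse = pow(odd, -1, 1 << 64)     # modular inverse mod 2**64
        table[z] = (shift, inverse)
    return table


TABLE = precompute(10)


def binomial_divisionless(n, k):
    """C(n, k) for small k, with no division in the main loop.

    Assumes 0 <= k <= min(n, 10) and that intermediate products fit in
    64 bits (true for n < 100 and k < 10).
    """
    if k == 0:
        return 1
    base = n - k
    r = base + 1                            # C(base + 1, 1)
    for z in range(2, k + 1):
        r = (r * (base + z)) & MASK64       # equals z * C(base + z, z)
        shift, inverse = TABLE[z]
        r = ((r >> shift) * inverse) & MASK64   # exact division by z
    return r
```

The precomputation itself uses a modular inverse, but it runs once per divisor; the per-call loop only multiplies and shifts.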
Another trick that you can put to good use is that the binomial coefficient is symmetric: you can replace k by n–k and get the same value. Thus if you can handle small values of k, you can also handle values of k that are close to n. That is, the above function will also work when n is smaller than 100 and k is larger than 90, if you just replace k by n–k.
Is that the fastest approach? Not at all. Because n is smaller than 100 and k smaller than 10, we can precompute (memoize) all possible values. You only need an array of 1000 values. It should fit in 8kB without any attempt at compression. And I am sure you can make it fit in 4kB with a little bit of compression effort. Still, there are instances where relying on a precomputed table of several kilobytes and keeping them in cache is inconvenient. In such cases, the divisionless function would be a good choice.
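The precomputed-table alternative can be sketched as follows, assuming the same bounds (n smaller than 100, k smaller than 10). The table is filled with Pascal's rule, so only additions are needed, and it is stored as 1000 unsigned 64-bit entries. The names `build_table` and `lookup` are hypothetical.

```python
from array import array

N_MAX, K_MAX = 100, 10  # table covers 0 <= n < 100 and 0 <= k < 10


def build_table():
    """Fill a flat table of C(n, k) using Pascal's rule (additions only)."""
    table = array("Q", [0]) * (N_MAX * K_MAX)   # 1000 unsigned 64-bit slots
    for n in range(N_MAX):
        table[n * K_MAX] = 1                    # C(n, 0) = 1
        for k in range(1, min(n, K_MAX - 1) + 1):
            # C(n, k) = C(n-1, k-1) + C(n-1, k); unset entries are 0
            table[n * K_MAX + k] = (
                table[(n - 1) * K_MAX + k - 1] + table[(n - 1) * K_MAX + k]
            )
    return table


TABLE = build_table()


def lookup(n, k):
    """Return C(n, k) with a single table access."""
    return TABLE[n * K_MAX + k]
```

With 8-byte entries the table occupies 1000 × 8 = 8000 bytes, which matches the 8kB estimate above.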
Alternatively, if you are happy with approximations, you will find floating-point implementations.
Apple Internals This repository provides tools and information to help understand and analyze the internals of Apple’s operating system platforms.
Collected knowledge about the internals of Apple’s platforms.
Sorted by keyword, abbreviation, or codename.
Source Browser - objc4
objc4-274
objc-exports
# Functions and variables explicitly exported from ObjC.
# GrP 2002-2-4
# Note that some commonly used functions are *not* listed in the
# ObjC headers (e.g. objc_flush_caches())
libobjc.order
objc4-818.2
Modular binary injection framework, successor of libhooker
ezinject is a lightweight and flexible binary injection framework. It can be thought of as a lightweight and less featured version of frida.
Its main and primary goal is to load a user module (.dll, .so, .dylib) inside a target process. These modules can augment ezinject by providing additional features, such as hooks, scripting languages, RPC servers, and so on. They can also be written in multiple languages such as C, C++, Rust, etc... as long as the ABI is respected.
NOTE: ezinject core is purposely small, and only implements the "kernel-mode" (debugger) features it needs to run the "user-mode" program, aka the user module.
It requires no dependencies other than the OS C library (capstone is optionally used only by user modules)
Porting ezinject is simple: no assembly code is required other than a few inline assembly statements, and an abstraction layer separates the implementations for multiple OSes.
ElleKit yet another tweak injector / tweak hooking library for darwin systems
What this is
- A C function hooker that patches memory pages directly
- An Objective-C function hooker
- An arm64 assembler
- A JIT inline assembly implementation for Swift
- A Substrate and libhooker API reimplementation
Diaphora A Free and Open Source Program Diffing Tool
Diaphora (διαφορά, Greek for 'difference') version 3.0 is the most advanced program diffing tool (working as an IDA plugin) available as of today (2023). It was first released during SyScan 2015 and has been actively maintained since then: it has been ported to every single minor version of IDA from 6.8 to 8.3.
Diaphora supports versions of IDA >= 7.4 because the code only runs in Python 3.X (Python 3.11 was the last version being tested).
Diaphora, the most advanced Free and Open Source program diffing tool.
Diaphora has many of the most common program diffing (bindiffing) features you might expect, like:
- Diffing assembler.
- Diffing control flow graphs.
- Porting symbol names and comments.
- Adding manual matches.
- Similarity ratio calculation.
- Batch automation.
- Call graph matching calculation.
- Dozens of heuristics based on graph theory, assembler, bytes, functions' features, etc...
However, Diaphora also has many features that are unique and not available in any other public tool. The following is a non-exhaustive list of unique features:
- Ability to port structs, enums, unions and typedefs.
- Potentially fixed vulnerabilities detection for patch diffing sessions.
- Support for compilation units (finding and diffing compilation units).
- Microcode support.
- Parallel diffing.
- Pseudo-code based heuristics.
- Pseudo-code patches generation.
- Diffing pseudo-codes (with syntax highlighting!).
- Scripting support (for both the exporting and diffing processes).
radare2's rafind2
SEARCH_DIRECTORY="./path/to/bins"
GREP_PATTERN='\x5B\x27\x21\x3D\xE9'
# Remove all instances of '\x' from PATTERN for rafind2
# Eg. Becomes 5B27213DE9
PATTERN="${GREP_PATTERN//\\x/}"
grep -rl "$GREP_PATTERN" "$SEARCH_DIRECTORY" | while IFS= read -r file; do
echo "$file:"
rafind2 -x "$PATTERN" "$file"
done
SEARCH_DIRECTORY="./path/to/bins"
PATTERN='5B27213DE9'
# Using find
find "$SEARCH_DIRECTORY" -type f -exec sh -c 'output=$(rafind2 -x "$1" "$2"); [ -n "$output" ] && echo "$2:" && echo "$output"' sh "$PATTERN" {} \;
# Using fd
fd --type f --exec sh -c 'output=$(rafind2 -x "$1" "$2"); [ -n "$output" ] && (echo "$2:"; echo "$output")' sh "$PATTERN" {} "$SEARCH_DIRECTORY"
⇒ time ./test-grep-and-rafind2
# ..snip..
./test-grep-and-rafind2 7.33s user 0.19s system 99% cpu 7.578 total
⇒ time ./test-find-and-rafind2
# ..snip..
./test-find-and-rafind2 3.24s user 0.72s system 98% cpu 4.041 total
⇒ time ./test-fd-and-rafind2
# ..snip..
./test-fd-and-rafind2 3.85s user 1.04s system 488% cpu 1.002 total
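For environments where rafind2 is not available, the same kind of search can be sketched in pure Python. This is an illustrative alternative, not part of the scripts timed above: the function name `find_hex_pattern` is hypothetical, and it reads each file fully into memory, so it is only suitable for reasonably sized binaries.

```python
from pathlib import Path


def find_hex_pattern(root, hex_pattern):
    """Return {file path: [byte offsets]} for every file under root
    that contains the given hex-encoded byte pattern."""
    needle = bytes.fromhex(hex_pattern)     # e.g. "5B27213DE9"
    hits = {}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        data = path.read_bytes()
        offsets = []
        start = data.find(needle)
        while start != -1:                  # collect every occurrence
            offsets.append(start)
            start = data.find(needle, start + 1)
        if offsets:
            hits[str(path)] = offsets
    return hits
```

Calling `find_hex_pattern("./path/to/bins", "5B27213DE9")` would return a dictionary mapping each matching file to the offsets where the pattern occurs.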
__next_f
, etc) (0xdevalias' gist))