Most of computer science is about providing functionality:
- User Interface
- Software Design
- Algorithms
- Operating Systems/Networking
- Compilers/PL
- Microarchitecture
- VLSI/CAD
Computer security is not about functionality. It is about how the embodiment of functionality behaves in the presence of an adversary.
Security mindset – think like a bad guy
Collaborative Center for Internet Epidemiology and Defenses (CCIED)
- UCSD/ICSI group created in response to worm threat
- Very well funded, many strong partners
Goals
- Internet epidemiology: measuring/understanding attacks
- Automated defenses: stopping outbreaks/attacks
- Economic and legal issues: that other stuff
50+ papers, lots of tech transfer, big systems, etc.
Network Telescope
- Passive monitor for > 1% of routable Internet address space
Potemkin & GQ Honeyfarms
- Active VM honeypot servers on >250k IP addresses
Earlybird
- On-line learning of new worm signatures in < 1ms
We didn’t stop Internet worms, let alone malware, let alone cybercrime… nor did anyone else.
At best, moved it around a bit. By any meaningful metric the bad guys are winning…
Mistake: looking at this solely as a technical problem
Efficient large-scale compromises
- Internet communications model
- Software homogeneity
- User naïveté/fatigue
Centralized control
- Makes compromised host a commodity good
- Platform economy
Profit-driven applications
- Commodity resources (IP, bandwidth, storage, CPU)
- Unique resources (PII/credentials, CD-Keys, address book, etc)
Emergence of economic engine for Internet crime
- SPAM, phishing, spyware, etc.
Fluid third party markets for illicit digital goods/services
- Bots ~$0.5/host, special orders, value added tiers
- Cards, malware, exploits, DDoS, cashout, etc.
September 2004 postings to SpecialHam.com, Spamforum.biz: 3.6 cents per bot week, 6 cents per bot week, 2.5 cents per bot week
Defenders reactive, attackers proactive
- Defenses public, attacker develops/tests in private
- Arms race where best case for defender is to “catch up”
New defenses expensive, new attacks cheap
- Defenses sunk costs/business model, attacker agile and not tied to particular technology
Low risk to attacker, high reward to attacker
- Minimal deterrence
- Functional anonymity on the Internet; very hard to fix
Defenses hard to measure, attacks easy to measure
- Few security metrics (no “evidence-based” security), attackers measure monetization which drives attack quality
We tend to think about this in terms of technical means for securing computer systems
Most of the $50-100B IT budget on cyber security is spent on securing the end host
- AV, firewalls, IDS, encryption, etc…
- Single most expensive front to secure
- Single hardest front to secure
But are individual end hosts valuable to bad guys?
- Maybe $1.50? Even less in bulk… not a pain point
What instead? Economically informed strategies
- Identify and attack economic bottlenecks in value chain
- This means understanding the return-on-investment for bad guys
We tend to focus on the costs of spam
- > 100 billion spam emails sent every day [Ironport]
- > $1B in direct costs – anti-spam products/services [IDC]
- Estimates of indirect costs (e.g., productivity) 10-100x more
But spam exists only because it is profitable
- Someone is buying! (though no one has admitted it to me…)
Our goal
- Understand underlying economic support for spam
Direct Mail: origins in 19th century catalog business
- Idea: send unsolicited advertisements to potential customers
- Rough value proposition: Delivery cost < (Conversion rate * Marginal revenue)
Modern direct mail (> $60B in US)
- Response rate: ~2.5% (mean per DMA)
- CPM (cost per thousand) = $250 - $1000
Spam is qualitatively the same…
Advantages of e-mail direct marketing
- No printing cost
- Legitimate delivery cost low (outsourced price ~ $0.001/message [Get Response])
- Dominated by production & lead generation cost (i.e. mailing list)
- But this is for spam as a legal marketing vehicle… a minority
Spam as marketing/bait for criminal enterprises (scams)
- Mailing lists → ε (purchase/steal/harvest) <$10/M retail
- Delivery cost → ε (botnet-based delivery) <$70/M retail
Suppose new technology filters out 99.9% of spam (at sites deploying it)
- Little impact on delivery cost, mainly lowers conversion rate
- Short term, compensate by sending more different e-mails or to more people
- … and pity the shmucks with the old 95% filter
- Long term, incentive for spammer to bypass filter
Seems likely the outcome of anti-spam has been
- Increased amount of spam sent
- Change in distribution of recipient pool
- Unclear what profit impact is (deployment biases)
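The arithmetic behind the filter thought experiment can be sketched as follows. This is a minimal illustration rather than a measured result: it assumes deliveries scale linearly with the fraction of messages that pass the filter, and the function name `volume_multiplier` is ours.

```python
from fractions import Fraction

def volume_multiplier(old_block_rate: Fraction, new_block_rate: Fraction) -> Fraction:
    """How many times more messages must be sent to keep the same number
    of delivered messages when a filter's block rate improves.

    Assumption (ours): deliveries scale linearly with (1 - block rate).
    """
    old_pass = 1 - old_block_rate
    new_pass = 1 - new_block_rate
    if new_pass <= 0:
        raise ValueError("a filter that blocks everything cannot be compensated for")
    return old_pass / new_pass

# Moving from the old 95% filter to the new 99.9% filter.
multiplier = volume_multiplier(Fraction(95, 100), Fraction(999, 1000))
print(f"send {float(multiplier):.0f}x more to reach as many inboxes")
```

Since delivery cost is close to zero, a volume increase of this size costs the sender little, while sites still running the older filter would see the extra volume.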
Anti-spam action
- Real-time IP blacklisting
- Clean up open relays/proxies
- Content-based learning
- Site takedown
- CAPTCHAs
Recall key basic inequality:
(Delivery Cost) < (Conversion Rate) x (Marginal Revenue)
We have some handle on two of these (e.g., [Franklin07])
- Delivery cost to send spam
  - Outsourced cost: retail purchase price < $70/M addrs
  - In-house cost: development/management labor
- Marginal revenue
  - Average pharma sale of $100, affiliate commissions ≈ 50%
Conversion rate is fundamentally different
- We don’t know; estimates vary by orders of magnitude
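To see what the inequality implies for the unknown term, the two known quantities can be plugged in to get a break-even conversion rate. This is a rough sketch: it treats the quoted figures as exact (the $70/M price is an upper bound) and ignores in-house labor costs; the function name is ours.

```python
def break_even_conversion_rate(delivery_cost_per_msg: float, marginal_revenue: float) -> float:
    """Conversion rate at which delivery cost equals expected revenue per message."""
    return delivery_cost_per_msg / marginal_revenue

# Figures quoted on the slide; treating them as exact is our simplification.
delivery_cost_per_msg = 70 / 1_000_000   # upper bound: < $70 per million addresses
marginal_revenue = 100 * 0.50            # $100 average sale, ~50% affiliate commission

rate = break_even_conversion_rate(delivery_cost_per_msg, marginal_revenue)
print(f"break-even conversion rate: about 1 in {round(1 / rate):,}")
```

Under these assumptions, a campaign bought at the retail ceiling needs roughly one sale per 700,000 messages to break even.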
No accident that we lack good conversion measures
It’s easy to measure spam from a receiver viewpoint
- Which MTA sent it to me?
- What does the content contain?
- Where do the links go? etc…
But the key economic issue is only known by the sender
- Conversion rate * marginal profit = revenue per msg sent
What to do?
- Interview spammers? (0.00036) [Carmack03]
- Guess? (“millions of dollars a day”) [Corman08]
- Send lots of spam and see who clicks on links? (gold standard)
Key idea: distributed C&C is a vulnerability
- Botnet authors like de-centralized communications for scalability and resilience, but…
- … to do so, they trust their bots to be good actors
- If you can modify the right bots you can observe and influence actions of the botnet
Rest of today: preliminary results from a case study
- Infiltrated Storm P2P botnet, instrumented ~500M spams
- Delivery rates (anti-spam impacts on delivery)
- Click through (visits to spam-advertised sites)
- Conversions (purchases and purchase amounts)
Botnet Infiltration
- Overview of the Storm peer-to-peer botnet
- Mechanics of botnet spamming
- How can Storm’s C&C be instrumented?
Economic issues
- Using a botnet for measurement
- How to measure conversion via C&C interposition
- Measuring spam delivery pipeline
- What happens to spam from when a bot sends it…
- …to when a user clicks “purchase” at a scam site?
Storm is a well-known peer-to-peer botnet
Storm has a hierarchical architecture
- Workers perform tasks (send spam, launch DDoS attacks, etc.)
- Proxies organize workers, connect to HTTP proxies
- Master servers controlled directly by botmaster
Workers and proxies are compromised hosts (bots)
- Use a Distributed Hash Table protocol (Overnet) for rendezvous
- Roughly 20,000 active bots at any time in April [Kanich08]
Master servers run in “bullet-proof” hosting centers
- Communicate with proxies and workers via command and control (C&C) protocol over TCP
New bots decide if they are proxies or workers
- Inbound connectivity? Yes, proxy. No, worker.
Proxies advertise their status via encrypted variant of Overnet DHT P2P protocol
- Master sends “Breath of Life” packet to new proxies to tell them IP address of master servers (RSA signature)
- Allows master servers to be mobile if necessary
Workers use Overnet to find proxies (tricky: time-based key identifies request)
Workers send to proxy, proxy forwards to one of master servers in “safe” data center
Bottom line: imperfect, but remarkably sophisticated
Workers request “updates” to send spam [Kreibich08]
- Dictionaries: names, domains, URLs, etc.
- Email templates for producing polymorphic spam
- Macros instantiate fields: %^Fdomains^% from domains dict
- Lists of target email addresses (batches of 500-1000 at a time)
Workers immediately act on these updates
- Create a unique message for each email address
- Send the message to the target
- Report the results (success, failure) back to proxies
Many campaign types
- Self-propagation malware, pharmaceutical, stocks, phishing, …
Example Storm spam template and instantiation
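The original template figure is not reproduced here. As a stand-in, the macro mechanism described above can be sketched in a few lines. Only the `%^Fdomains^%` form comes from the slides; the template text, the dictionary contents, and the function name `instantiate` are hypothetical, and the real macro language is richer than this single form.

```python
import random
import re

# Matches macros of the form %^F<name>^%, the one form quoted on the slide.
MACRO = re.compile(r"%\^F(\w+)\^%")

def instantiate(template: str, dictionaries: dict[str, list[str]], rng: random.Random) -> str:
    """Replace each %^F<name>^% macro with a random entry from dictionary <name>."""
    def pick(match: re.Match) -> str:
        return rng.choice(dictionaries[match.group(1)])
    return MACRO.sub(pick, template)

# Hypothetical template and dictionaries, for illustration only.
template = "From: %^Fnames^%@%^Fdomains^%\nSubject: hello\n\nVisit http://%^Fdomains^%/"
dictionaries = {
    "names": ["alice", "bob"],
    "domains": ["example.com", "example.net"],
}

message = instantiate(template, dictionaries, random.Random(0))
print(message)
```

Because each macro is resolved independently, every recipient can receive a slightly different message, which is the polymorphism the templates are designed to produce.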
Templates updated fairly frequently (but mainly just header polymorphism changes)
A few special campaigns
- Test campaigns
- Special mailing list campaigns (e.g. only Canadian recipients)
Storm nodes also harvest e-mail addresses
- Grovel hard disk and send back foo@bar.baz strings
- Re-integrated into master mailing list (some filtering)
Storm nodes also do DDoS, DNS fast flux proxying and Web proxying
Several different levels of message encoding, but nothing really hard to reverse yet
We interpose on Storm command and control network
- Reverse-engineered Storm protocols, communication scrambling, rendezvous mechanisms [Kanich08] [Kreibich08]
Run unmodified Storm proxy bots in VMs
- Key issue: Real bot workers connect to our proxies
Insert rewriting proxies between workers & proxies
- Transparently interpose on messages between Storm proxies and their associated Storm workers
- Generic engine for rewriting traffic based on rules
Interpose to control site URLs and spam delivery
- Which sites the spam advertises (replace URLs in template links)
- To whom spam gets sent (replace addrs in target list)
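A rule-based rewriting step of this kind might look like the sketch below. It is not the engine used in the study: it assumes the C&C payload has already been decoded to text, and the `Rule` class, the function names, and the example URLs and addresses are all hypothetical.

```python
import re
from dataclasses import dataclass

@dataclass
class Rule:
    """One rewrite rule: a compiled pattern and its replacement text."""
    pattern: re.Pattern
    replacement: str

def rewrite(payload: str, rules: list[Rule]) -> str:
    """Apply each rewrite rule, in order, to a decoded payload."""
    for rule in rules:
        payload = rule.pattern.sub(rule.replacement, payload)
    return payload

def add_control_addresses(targets: list[str], controls: list[str]) -> list[str]:
    """Append test addresses to a target list, keeping the existing targets."""
    return targets + [c for c in controls if c not in targets]

# Hypothetical rule: point every advertised link at a measurement site.
rules = [Rule(re.compile(r"http://[^/\s]+/"), "http://measurement.example/")]
rewritten = rewrite("Visit http://pharma-original.example/ today", rules)
targets = add_control_addresses(["user@example.org"], ["probe@example.net"])
print(rewritten)
print(targets)
```

Keeping the rules as data means the same engine can serve both kinds of interposition: URL substitution in templates and changes to target lists.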
Create two sites that mirror actual sites in spam
- E-card (self-propagation) and pharmaceutical
- Replace dictionaries with URLs to our sites
E-card (self-prop) site and pharma site
- Log all accesses up through clicks on “purchase”
- Track the contents of shopping carts
Strive for verisimilitude to remove bias (spam filtering)
- Site content is similar, URLs have same format as originals, …
Create various test email accounts
- At Web mail providers: Hotmail, Yahoo!, Gmail
- Behind a commercial spam filtering appliance
- As SMTP sinks: accept every message delivered
Put email addresses in Storm target delivery lists
Log all emails delivered to these addresses
- Both labeled as spam (“Junk E-mail”) and in inbox
Consequentialism
First, do no harm (users no worse off than before)
- We do not send any spam
  - Proxies are relays, worker bots send spam
- We do not enable additional spam to be sent
  - Workers would have connected to some other proxy
- We do not enable spam to be sent to additional users
  - Users are already on target lists, only add control addresses
Second, reduce harm where possible
- Our pharma sites don’t take credit card info
- Our e-card sites don’t export malicious code
Warning: IANAL (we had lawyers involved though)
CAN-SPAM
- Subject to strong definition of “initiator”; we don’t fit it
ECPA
- Our proxy is directly addressed by worker bots (“party to” communication carve out)
CFAA
- We do not contact worker bots, they contact us (“unauthorized access”?)
- We do not cause any information to be extracted or any fundamentally new activity to take place
- Hard to find a good theory of damages (functionally indistinguishable -- consequentialism)
In this kind of work there is little precedent
- No agency to get permission; no way to get indemnity
- Lawyers tend to say “I believe this activity has low risk of…”
- Security researchers in industry, academia
- Affected network operators/registrars
- Law enforcement
- FTC
Lots of operational complexities to a study like this
- Net Ops notices huge Storm infestation
- Address space cleanliness
- Registrar issues
- Abuse complaints
- Spam site support e-mail
- Anti-virus signatures
- Law enforcement
Experimented with Storm March 21 – April 15, 2008
Instrumented roughly 1.5% of Storm’s total output
Recall that we tracked the contents of shopping carts
Using the prices on the actual site, we can estimate the value of the purchases
- 28 purchases for $2,731 over 25 days, or $100/day ($140 active)
We only interposed on a fraction of the workers
- Connected to approx 1.5% of workers
- Back-of-the-envelope (be very careful) $7-10k/day for all, or ~$3M/year
- With a 50% affiliate commission, $1.5M/year revenue
For self-propagation
- Roughly 3-9k new bots/day
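The back-of-the-envelope extrapolation for the pharmaceutical campaign can be reproduced as below. It is a sketch under a strong assumption, namely that revenue scales linearly from the roughly 1.5% of workers observed to the whole botnet, which is exactly why the slide says to be very careful. The function name `extrapolate` is ours.

```python
def extrapolate(daily_observed: float, fraction_observed: float, commission: float):
    """Scale observed daily purchases to the whole botnet and to a year.

    Assumption (ours): revenue scales linearly with the share of workers observed.
    Returns (daily total, yearly total, yearly affiliate commission).
    """
    daily_all = daily_observed / fraction_observed
    yearly_all = daily_all * 365
    return daily_all, yearly_all, yearly_all * commission

average_daily = 2731 / 25   # all 25 days of the experiment
active_daily = 140          # the "$140 active" figure from the slide

low = extrapolate(average_daily, 0.015, 0.5)
high = extrapolate(active_daily, 0.015, 0.5)
for label, (day, year, cut) in (("average", low), ("active", high)):
    print(f"{label}: ${day:,.0f}/day, ${year / 1e6:.1f}M/year, ${cut / 1e6:.1f}M/year commission")
```

The two inputs bracket the $7-10k/day range quoted above, and the yearly totals land near the ~$3M and $1.5M figures once rounded.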
First measurement study of spam marketing conversion
Infiltrated Storm botnet, interposed on spam campaigns
- Rewriting proxies take advantage of Storm reverse-engineering
Pharmaceutical spam
- 1 in 12M conversion rate, $1.5M/yr net revenue
- Profitability possibly tied to infrastructure integration
- Sent via retail market, this campaign would not be profitable
- Ergo: in-house delivery (Storm owners = pharma spammers)
Self Propagation spam
- 250k spam emails per infection
- Social engineering effective: one in ten visitors run executable
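The claim that the pharmaceutical campaign would not pay at retail delivery prices can be checked roughly with the measured conversion rate. This sketch reuses figures quoted earlier in the talk ($100 average sale, about 50% commission, retail delivery under $70 per million addresses) and treats them as exact, which is our simplification. The retail figure is an upper bound, so the real gap may be smaller.

```python
conversion_rate = 1 / 12_000_000      # measured: 1 in 12M
commission_per_sale = 100 * 0.50      # $100 average sale, ~50% affiliate commission
retail_cost_per_million = 70.0        # quoted ceiling: < $70 per million addresses

# Expected commission earned per million messages sent.
revenue_per_million = conversion_rate * commission_per_sale * 1_000_000
profitable_at_retail = revenue_per_million > retail_cost_per_million

print(f"expected revenue: ${revenue_per_million:.2f} per million messages")
print(f"profitable at the retail ceiling: {profitable_at_retail}")
```

At a few dollars per million messages, the campaign only makes sense if delivery costs far less than the retail ceiling, which is consistent with the in-house delivery conclusion above.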
More analysis
- Extending infiltration to ~15 botnets; comparative analysis
- Characteristic fingerprints of different spammers/crews
- Characterizing supply chain relationships
- Broadly order on-line “viagra”, rolexes, etc
- Cluster credit processor/merchant, mailing materials, etc
- Cluster on manufacturing fingerprint (e.g., NIR spectroscopy)
- Measuring monetization by purposely losing credit cards
Proactive defenses
- Automated filter generation from templates
- Automated classification of URLs
- Automated vision-based detection of phishing pages
CSE107 – Introduction to modern cryptography
CSE127 – Computer Security
But… Security plays a role in virtually all of your courses
Value-chain characterization
- Empirical map establishing links between criminal groups and enablers
- Affiliate programs, botnets, fast flux networks, registrars, payment processors, SEO/traffic partners, fulfillment/manufacturing
- Data mining across huge data feeds we’ve built or established relationships for
- Social network among criminal groups
About to start purchasing wide range of spam-advertised products
Cluster purchases based on
- Merchant and processor
- Packaging (postmark, forensic analysis of paper)
- Artifacts of manufacturing process (e.g., FT-NIR on drugs)
Anyone can send email to our accounts or visit our Web sites, potentially muddying the waters
- Use various heuristics to validate the logs
Validate spam in mailboxes was sent by us
- Spam from other campaigns, bounce messages, etc.
- Subject line matches our campaign, URL from our dictionary
Validate Web accesses were by users in response
- Sites with links in spam are immediately crawled by Google, A/V vendors, etc.
- Special 3rd-level DNS names, special url encoding
- Ignore hosts that access robots.txt, don’t load javascript, don’t load flash, don’t load images, many malformed requests
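The Web-log heuristics listed above could be applied with a filter along these lines. This is a sketch with assumptions of our own: the slide does not say whether the signals are combined with "any" or "all" (we treat any single signal as disqualifying), the malformed-request threshold is invented, and the `HostActivity` fields are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class HostActivity:
    """Per-host summary of Web log entries (field names are hypothetical)."""
    ip: str
    fetched_robots_txt: bool
    loaded_javascript: bool
    loaded_flash: bool
    loaded_images: bool
    malformed_requests: int

def looks_like_real_user(host: HostActivity, max_malformed: int = 3) -> bool:
    """Return False for hosts that match any of the crawler heuristics."""
    if host.fetched_robots_txt:
        return False
    if not (host.loaded_javascript and host.loaded_flash and host.loaded_images):
        return False
    return host.malformed_requests <= max_malformed

hosts = [
    HostActivity("192.0.2.1", False, True, True, True, 0),    # browser-like
    HostActivity("192.0.2.2", True, False, False, False, 0),  # crawler-like
    HostActivity("192.0.2.3", False, True, True, True, 40),   # scanner-like
]
kept = [h.ip for h in hosts if looks_like_real_user(h)]
print(kept)
```

Erring on the side of discarding hosts makes the resulting visit and conversion counts conservative.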