Most of computer science is about providing functionality









  • Most of computer science is about providing functionality:

    • User Interface
    • Software Design
    • Algorithms
    • Operating Systems/Networking
    • Compilers/PL
    • Microarchitecture
    • VLSI/CAD
  • Computer security is not about functionality

  • It is about how the embodiment of functionality behaves in the presence of an adversary

  • Security mindset – think like a bad guy




  • Collaborative Center for Internet Epidemiology and Defenses (CCIED)

    • UCSD/ICSI group created in response to worm threat
    • Very well funded, many strong partners
  • Goals

    • Internet epidemiology: measuring/understanding attacks
    • Automated defenses: stopping outbreaks/attacks
    • Economic and legal issues: that other stuff


  • 50+ papers, lots of tech transfer, big systems, etc.

  • Network Telescope

    • Passive monitor for > 1% of routable Internet addr space
  • Potemkin & GQ Honeyfarms

    • Active VM honeypot servers on >250k IP addresses
  • Earlybird

    • On-line learning of new worm signatures in < 1ms



  • We didn’t stop Internet worms, let alone malware, let alone cybercrime… nor did anyone else.

  • At best, moved it around a bit.

  • By any meaningful metric the bad guys are winning…

  • Mistake: looking at this solely as a technical problem




  • Efficient large-scale compromises

    • Internet communications model
    • Software homogeneity
    • User naïveté/fatigue
  • Centralized control

    • Makes compromised host a commodity good
    • Platform economy
  • Profit-driven applications

    • Commodity resources (IP, bandwidth, storage, CPU)
    • Unique resources (PII/credentials, CD-Keys, address book, etc)



  • Emergence of economic engine for Internet crime

    • Spam, phishing, spyware, etc.
  • Fluid third party markets for illicit digital goods/services

    • Bots ~$0.5/host, special orders, value added tiers
    • Cards, malware, exploits, DDoS, cashout, etc.


  • Quoted prices in September 2004 postings to SpecialHam.com and Spamforum.biz: 3.6, 6, and 2.5 cents per bot-week








  • Defenders reactive, attackers proactive

    • Defenses public, attacker develops/tests in private
    • Arms race where best case for defender is to “catch up”
  • New defenses expensive, new attacks cheap

    • Defenses sunk costs/business model, attacker agile and not tied to particular technology
  • Low risk to attacker, high reward to attacker

    • Minimal deterrence
    • Functional anonymity on the Internet; very hard to fix
  • Defenses hard to measure, attacks easy to measure

    • Few security metrics (no “evidence-based” security), attackers measure monetization which drives attack quality



  • We tend to think about this in terms of technical means for securing computer systems

  • Most of the $50-100B IT budget for cyber security is spent on securing the end host

    • AV, firewalls, IDS, encryption, etc…
    • Single most expensive front to secure
    • Single hardest front to secure
  • But are individual end hosts valuable to bad guys?

    • Maybe $1.50? Even less in bulk… not a pain point
  • What instead? Economically informed strategies

    • Identify and attack economic bottlenecks in value chain
    • This means understanding the return-on-investment for bad guys



  • We tend to focus on the costs of spam

    • > 100 Billion spam emails sent every day [Ironport]
    • > $1B in direct costs – anti-spam products/services [IDC]
    • Estimates of indirect costs (e.g., productivity) 10-100x more
  • But spam exists only because it is profitable

    • Someone is buying! (though no one has admitted it to me…)
  • Our goal

    • Understand underlying economic support for spam



  • Direct Mail: origins in 19th century catalog business

    • Idea: send unsolicited advertisements to potential customers
    • Rough value proposition: Delivery cost < (Conversion rate * Marginal revenue)
  • Modern direct mail (> $60B in US)

    • Response rate: ~2.5% (mean per DMA)
    • CPM (cost per thousand) = $250 - $1000
  • Spam is qualitatively the same…
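
As a rough illustration of the value proposition above, the sketch below turns the quoted CPM range and response rate into the marginal revenue per response a mailer would need just to break even. The function name is hypothetical, and treating CPM as the entire delivery cost is an assumption made only for this example.

```python
def break_even_revenue_per_response(cpm_dollars: float, response_rate: float) -> float:
    """Marginal revenue per response at which delivery cost equals expected revenue.

    Rearranges the slide's inequality:
        delivery cost < conversion rate * marginal revenue
    """
    cost_per_piece = cpm_dollars / 1000.0  # CPM is the cost per thousand pieces
    return cost_per_piece / response_rate


# CPM range and response rate as quoted on the slide.
for cpm in (250.0, 1000.0):
    needed = break_even_revenue_per_response(cpm, response_rate=0.025)
    print(f"CPM ${cpm:.0f}: each response must bring in more than ${needed:.2f}")
```

With these inputs the break-even point lands at roughly $10 to $40 per response, which is the bar that a direct mail campaign's margins have to clear.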




  • Advantages of e-mail direct marketing

    • No printing cost
    • Legitimate delivery cost low (outsourced price ~ $0.001/message [Get Response])
    • Dominated by production & lead generation cost (i.e. mailing list)
    • But this is for spam as a legal marketing vehicle… a minority
  • Spam as marketing/bait for criminal enterprises (scams)

    • Mailing lists → ε (purchase/steal/harvest) <$10/M retail
    • Delivery cost → ε (botnet-based delivery) <$70/M retail



  • Suppose new technology filters out 99.9% of spam (at sites deploying it)

    • Little impact on delivery cost, mainly lowers conversion rate
    • Short term, compensate by sending more different e-mails or to more people
      • … and pity the shmucks with the old 95% filter
    • Long term, incentive for spammer to bypass filter
  • Seems likely the outcome of anti-spam has been

    • Increased amount of spam sent
    • Change in distribution of recipient pool
    • Unclear what profit impact is (deployment biases)
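
To make the short-term compensation argument concrete, here is a minimal sketch of how much more mail a sender would need to push through a better filter to keep the same number of delivered messages. The function name is hypothetical, and it assumes filtering is the only loss between sending and delivery, which is a simplification.

```python
def volume_multiplier(old_block_rate: float, new_block_rate: float) -> float:
    """Factor by which sent volume must grow to keep delivered volume constant."""
    old_pass = 1.0 - old_block_rate  # fraction that used to get through
    new_pass = 1.0 - new_block_rate  # fraction that gets through now
    return old_pass / new_pass


# Moving from the "old 95% filter" to the hypothetical 99.9% filter.
factor = volume_multiplier(old_block_rate=0.95, new_block_rate=0.999)
print(f"send about {factor:.0f}x more to deliver the same amount")
```

Under this assumption the sender needs about fifty times the volume, and any site still running the 95% filter sees that increase land in its inboxes.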



  • Anti-spam action

  • Real-time IP blacklisting

  • Clean up open relays/proxies

  • Content-based learning

  • Site takedown

  • CAPTCHAs






  • Recall key basic inequality:

  • (Delivery Cost) < (Conversion Rate) x (Marginal Revenue)

  • We have some handle on two of these (e.g., [Franklin07])

    • Delivery cost to send spam
      • Outsourced cost: retail purchase price < $70/M addrs
      • In-house cost: development/management labor
    • Marginal revenue
      • Average pharma sale of $100, affiliate commissions ≈ 50%
  • Conversion rate is fundamentally different

    • We don’t know; estimates vary by orders of magnitude
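
Since the conversion rate is the unknown, one way to read the inequality is to solve it for the conversion rate a spammer would need in order to break even. The sketch below does that with the figures quoted above; the function name is hypothetical, and the retail price of $70 per million is used as an upper bound on delivery cost.

```python
def break_even_conversion_rate(delivery_cost_per_million: float,
                               sale_price: float,
                               commission: float) -> float:
    """Conversion rate at which delivery cost equals expected affiliate revenue."""
    cost_per_message = delivery_cost_per_million / 1_000_000
    revenue_per_sale = sale_price * commission  # the affiliate's cut of one sale
    return cost_per_message / revenue_per_sale


rate = break_even_conversion_rate(delivery_cost_per_million=70.0,
                                  sale_price=100.0,
                                  commission=0.5)
print(f"break-even conversion rate: {rate:.2e} (about 1 sale per {1 / rate:,.0f} messages)")
```

At retail delivery prices, the spammer would need on the order of one sale per 700,000 messages; cheaper in-house delivery lowers that bar proportionally.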



  • No accident that we lack good conversion measures

  • It’s easy to measure spam from a receiver viewpoint

    • Which MTA sent it to me?
    • What does the content contain?
    • Where do the links go? etc…
  • But the key economic issue is only known by the sender

    • Conversion rate * marginal profit = revenue per msg sent
  • What to do?

    • Interview spammers? (0.00036) [Carmack03]
    • Guess? (“millions of dollars a day” [Corman08])
    • Send lots of spam and see who clicks on links? (gold standard)



  • Key idea: distributed C&C is a vulnerability

    • Botnet authors like de-centralized communications for scalability and resilience, but…
    • … to do so, they trust their bots to be good actors
    • If you can modify the right bots you can observe and influence actions of the botnet
  • Rest of today: preliminary results from a case study

    • Infiltrated Storm P2P botnet, instrumented ~500M spams
    • Delivery rates (anti-spam impacts on delivery)
    • Click-through (visits to spam-advertised sites)
    • Conversions (purchases and purchase amounts)



  • Botnet Infiltration

    • Overview of the Storm peer-to-peer botnet
      • How does Storm work?
    • Mechanics of botnet spamming
      • How can Storm’s C&C be instrumented?
  • Economic issues

    • Using a botnet for measurement
      • How to measure conversion via C&C interposition
    • Measuring spam delivery pipeline
      • What happens to spam from when a bot sends it…
      • …to when a user clicks “purchase” at a scam site?



  • Storm is a well-known peer-to-peer botnet

  • Storm has a hierarchical architecture

    • Workers perform tasks (send spam, launch DDoS attacks, etc.)
    • Proxies organize workers, connect to HTTP proxies
    • Master servers controlled directly by botmaster
  • Workers and proxies are compromised hosts (bots)

    • Use a Distributed Hash Table protocol (Overnet) for rendezvous
    • Roughly 20,000 active bots at any time in April [Kanich08]
  • Master servers run in “bullet-proof” hosting centers

    • Communicate with proxies and workers via command and control (C&C) protocol over TCP





  • New bots decide if they are proxies or workers

    • Inbound connectivity? Yes, proxy. No, worker.
  • Proxies advertise their status via encrypted variant of Overnet DHT P2P protocol

    • Master sends “Breath of Life” packet to new proxies to tell them IP address of master servers (RSA signature)
    • Allows master servers to be mobile if necessary
  • Workers use Overnet to find proxies (tricky: time-based key identifies request)

  • Workers send to proxy, proxy forwards to one of master servers in “safe” data center

  • Bottom line: imperfect, but remarkably sophisticated




  • Workers request “updates” to send spam [Kreibich08]

    • Dictionaries: names, domains, URLs, etc.
    • Email templates for producing polymorphic spam
      • Macros instantiate fields: %^Fdomains^% from domains dict
    • Lists of target email addresses (batches of 500-1000 at a time)
  • Workers immediately act on these updates

    • Create a unique message for each email address
    • Send the message to the target
    • Report the results (success, failure) back to proxies
  • Many campaign types

    • Self-propagation malware, pharmaceutical, stocks, phishing, …
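
The template mechanism described above can be sketched as macro expansion over dictionaries. Only the `%^Fdomains^%` macro form appears in the slide; the regular expression, the `names` dictionary, the random choice of entry, and every function name below are assumptions for illustration, not a description of Storm's actual implementation.

```python
import random
import re

# Matches macros of the form %^F<dictionary>^%, as in the slide's %^Fdomains^% example.
MACRO = re.compile(r"%\^F(\w+)\^%")


def instantiate(template: str,
                dictionaries: dict[str, list[str]],
                rng: random.Random) -> str:
    """Replace each macro with an entry drawn from the named dictionary."""
    def pick(match: re.Match) -> str:
        # Raises KeyError if the template names a dictionary we were not given.
        return rng.choice(dictionaries[match.group(1)])
    return MACRO.sub(pick, template)


# Hypothetical dictionaries and template, standing in for a worker's "update".
dictionaries = {"domains": ["example.com", "example.net"],
                "names": ["alice", "bob"]}
template = "From: %^Fnames^%@%^Fdomains^%"

rng = random.Random(0)
for target in ["user1@example.org", "user2@example.org"]:
    # One freshly instantiated message per target address.
    print(target, "<-", instantiate(template, dictionaries, rng))
```

Drawing a fresh entry for every macro on every message is what makes the output polymorphic: two targets rarely receive byte-identical mail.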


  • Example Storm spam template and instantiation




  • Templates updated fairly frequently (but mainly just header polymorphism changes)

  • A few special campaigns

    • Test campaigns
    • Special mailing list campaigns (e.g., only Canadian recipients)
  • Storm nodes also harvest e-mail addresses

    • Grovel hard disk and send back foo@bar.baz strings
    • Re-integrated into master mailing list (some filtering)
  • Storm nodes also do DDoS, DNS fast flux proxying and Web proxying

  • Several different levels of message encoding, but nothing really hard to reverse yet






  • We interpose on Storm command and control network

    • Reverse-engineered Storm protocols, communication scrambling, rendezvous mechanisms [Kanich08] [Kreibich08]
  • Run unmodified Storm proxy bots in VMs

    • Key issue: Real bot workers connect to our proxies
  • Insert rewriting proxies between workers & proxies

    • Transparently interpose on messages between Storm proxies and their associated Storm workers
    • Generic engine for rewriting traffic based on rules
  • Interpose to control site URLs and spam delivery

    • Which sites the spam advertises (replace URLs in template links)
    • To whom spam gets sent (replace addrs in target list)
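
A rule-driven rewriting engine of the kind described above might look like the following minimal sketch. The `Update` structure, its `kind` labels, and the rule functions are all hypothetical; the real Storm message formats are not shown in these slides, so this only illustrates the idea of applying rules to traffic passing through the proxy.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Update:
    """Hypothetical stand-in for one decoded master-to-worker message."""
    kind: str            # assumed labels: "dictionary" or "targets"
    name: str
    entries: list[str]


Rule = Callable[[Update], Update]


def replace_urls(our_urls: list[str]) -> Rule:
    """Rule: swap the URL dictionary for URLs of the measurement sites."""
    def rule(update: Update) -> Update:
        if update.kind == "dictionary" and update.name == "urls":
            return Update(update.kind, update.name, list(our_urls))
        return update
    return rule


def add_control_addresses(control: list[str]) -> Rule:
    """Rule: append test addresses to each target batch."""
    def rule(update: Update) -> Update:
        if update.kind == "targets":
            return Update(update.kind, update.name, update.entries + control)
        return update
    return rule


def rewrite(update: Update, rules: list[Rule]) -> Update:
    """Apply every rule in order; non-matching rules pass the update through."""
    for rule in rules:
        update = rule(update)
    return update


rules = [replace_urls(["http://pharma.example/"]),
         add_control_addresses(["probe@sink.example"])]
out = rewrite(Update("targets", "batch-1", ["a@example.org"]), rules)
print(out.entries)
```

Keeping each rule as a small function that either rewrites or passes through means new campaigns can be handled by adding rules rather than changing the engine.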





  • Create two sites that mirror actual sites in spam

    • E-card (self-propagation) and pharmaceutical
    • Replace dictionaries with URLs to our sites
  • E-card (self-prop) site

  • Pharma site

    • Log all accesses up through clicks on “purchase”
    • Track the contents of shopping carts
  • Strive for verisimilitude to remove bias (spam filtering)

    • Site content is similar, URLs have same format as originals, …





  • Create various test email accounts

    • At Web mail providers: Hotmail, Yahoo!, Gmail
    • Behind a commercial spam filtering appliance
    • As SMTP sinks: accept every message delivered
  • Put email addresses in Storm target delivery lists

  • Log all emails delivered to these addresses

    • Both labeled as spam (“Junk E-mail”) and in inbox



  • Consequentialism

  • First, do no harm (users no worse off than before)

    • We do not send any spam
      • Proxies are relays, worker bots send spam
    • We do not enable additional spam to be sent
      • Workers would have connected to some other proxy
    • We do not enable spam to be sent to additional users
      • Users are already on target lists, only add control addresses
  • Second, reduce harm where possible

    • Our pharma sites don’t take credit card info
    • Our e-card sites don’t export malicious code



  • Warning: IANAL (we had lawyers involved though)

  • CAN-SPAM

    • Subject to strong definition of “initiator”; we don’t fit it
  • ECPA

    • Our proxy is directly addressed by worker bots (“party to” communication carve out)
  • CFAA

    • We do not contact worker bots, they contact us (“unauthorized access”?)
    • We do not cause any information to be extracted or any fundamentally new activity to take place
    • Hard to find a good theory of damages (functionally indistinguishable -- consequentialism)



  • In this kind of work there is little precedent

    • No agency to get permission; no way to get indemnity
    • Lawyers tend to say “I believe this activity has low risk of…”
  • We communicate our activities to a lot of people

    • Security researchers in industry, academia
    • Affected network operators/registrars
    • Law enforcement
    • FTC



  • Lots of operational complexities to a study like this

  • Net Ops notices huge Storm infestation

  • Address space cleanliness

  • Registrar issues

    • GoDaddy
    • TUCOWS
  • Abuse complaints

  • Spam site support e-mail

  • Anti-virus signatures

  • Law enforcement




  • Experimented with Storm March 21 – April 15, 2008

  • Instrumented roughly 1.5% of Storm’s total output






  • Recall that we tracked the contents of shopping carts

  • Using the prices on the actual site, we can estimate the value of the purchases

    • 28 purchases for $2,731 over 25 days, or about $100/day ($140/day while the campaign was active)
  • We only interposed on a fraction of the workers

    • Connected to approx 1.5% of workers
    • Back-of-the-envelope (be very careful): $7-10k/day for all, or ~$3M/year
    • With a 50% affiliate commission, $1.5M/year revenue
  • For self-propagation

    • Roughly 3-9k new bots/day
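
The back-of-the-envelope numbers above can be reproduced with a few lines of arithmetic. This sketch assumes revenue scales linearly with the fraction of workers reached, which is exactly the kind of assumption the "be very careful" caveat is about; the variable and function names are illustrative.

```python
# Figures quoted on the slide above.
observed_revenue = 2731.0   # dollars across all logged purchases
study_days = 25
active_per_day = 140.0      # dollars/day while the campaign was active
worker_fraction = 0.015     # share of Storm workers reached by the study
commission = 0.50           # affiliate share of each sale


def scale_to_whole_botnet(dollars_per_day: float) -> float:
    """Naive linear extrapolation from the observed slice to all workers."""
    return dollars_per_day / worker_fraction


low_per_day = scale_to_whole_botnet(observed_revenue / study_days)
high_per_day = scale_to_whole_botnet(active_per_day)
low_per_year = low_per_day * 365
high_per_year = high_per_day * 365

print(f"whole botnet: ${low_per_day:,.0f} to ${high_per_day:,.0f} per day")
print(f"per year:     ${low_per_year:,.0f} to ${high_per_year:,.0f}")
print(f"affiliate share: ${low_per_year * commission:,.0f} "
      f"to ${high_per_year * commission:,.0f} per year")
```

The two daily rates bracket the $7-10k/day range, and the yearly totals straddle the ~$3M figure.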



  • First measurement study of spam marketing conversion

  • Infiltrated Storm botnet, interposed on spam campaigns

    • Rewriting proxies take advantage of Storm reverse-engineering
  • Pharmaceutical spam

    • 1 in 12M conversion rate → $1.5M/yr net revenue
    • Profitability possibly tied to infrastructure integration
    • Sent via retail market, this campaign would not be profitable
    • Ergo: in-house delivery (Storm owners = pharma spammers)
  • Self Propagation spam

    • 250k spam emails per infection
    • Social engineering effective: one in ten visitors run executable



  • More analysis

    • Extending infiltration to ~15 botnets; comparative analysis
    • Characteristic fingerprints of different spammers/crews
    • Characterizing supply chain relationships
      • Broadly order on-line “viagra”, Rolexes, etc.
      • Cluster credit processor/merchant, mailing materials, etc
      • Cluster on manufacturing fingerprint (e.g., NIR spectroscopy)
    • Measuring monetization by purposely losing credit cards
  • Proactive defenses

    • Automated filter generation from templates
    • Automated classification of URLs
    • Automated vision-based detection of phishing pages



  • CSE107 – Introduction to modern cryptography

  • CSE127 – Computer Security

  • But…

  • Security plays a role in virtually all of your courses






  • Value-chain characterization

    • Empirical map establishing links between criminal groups and enablers
      • Affiliate programs, botnets, fast flux networks, registrars, payment processors, SEO/traffic partners, fulfillment/manufacturing
      • Data mining across huge data feeds we’ve built or established relationships for
    • Social network among criminal groups


  • About to start purchasing a wide range of spam-advertised products

    • Watches
    • Pharma
    • Traffic
  • Cluster purchases based on

    • Merchant and processor
    • Packaging (postmark, forensic analysis of paper)
    • Artifacts of manufacturing process (e.g., FT-NIR on drugs)









  • Anyone can send email to our accounts or visit our Web sites, potentially muddying the waters

    • Use various heuristics to validate the logs
  • Validate spam in mailboxes was sent by us

    • Spam from other campaigns, bounce messages, etc.
    • Subject line matches our campaign, URL from our dictionary
  • Validate Web accesses were by users in response

    • Sites with links in spam are immediately crawled by Google, A/V vendors, etc.
    • Special 3rd-level DNS names, special URL encoding
    • Ignore hosts that access robots.txt, don’t load JavaScript, don’t load Flash, don’t load images, or make many malformed requests
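
The host-filtering heuristics above could be applied to per-host access logs along the following lines. This is a minimal sketch: the `HostActivity` structure, the file-extension checks, and the malformed-request threshold are all assumptions, and the Flash check is left out for brevity since it would follow the same pattern.

```python
from dataclasses import dataclass, field


@dataclass
class HostActivity:
    """Hypothetical summary of one client host's requests to a measurement site."""
    paths: list[str] = field(default_factory=list)
    malformed: int = 0  # count of requests that failed to parse


def looks_like_user(host: HostActivity, max_malformed: int = 3) -> bool:
    """Keep hosts that behave like a browser; drop likely crawlers and scanners."""
    fetched_robots = any(p == "/robots.txt" for p in host.paths)
    loaded_images = any(p.endswith((".png", ".jpg", ".gif")) for p in host.paths)
    loaded_script = any(p.endswith(".js") for p in host.paths)
    return (not fetched_robots
            and loaded_images
            and loaded_script
            and host.malformed <= max_malformed)


crawler = HostActivity(paths=["/robots.txt", "/"])
visitor = HostActivity(paths=["/", "/logo.png", "/cart.js"])
print(looks_like_user(crawler), looks_like_user(visitor))
```

Requiring positive evidence of browser behavior, rather than only blacklisting known crawlers, errs on the side of undercounting visits, which keeps the conversion estimates conservative.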







