From: Gabriel Quadros
Newsgroups: comp.compilers
Subject: Dealing with load/store instructions on static tainted flow analysis
Date: Mon, 6 Jun 2011 21:00:41 -0700 (PDT)
Organization: Compilers Central
Message-ID: <11-06-010@comp.compilers>
Keywords: storage, analysis, question

Dear guys,

I am trying to implement a pass to detect information leaks in
programs. The problem is a variation of static tainted-flow analysis:
I have some source functions, sink functions and sanitizers, and I
want to know whether data can flow from a source to a sink without
passing through a sanitizer. I am using LLVM, and I am analyzing LLVM
bitcode.

My pass is working well, but I am having some issues with memory. Once
information flows to the heap, it is hard to know how it propagates to
the rest of the program. Example:

    a = SOURCE
    b = malloc(100)
    ...
    b[i] = a
    ...
    SINK = c[j]
    ...

The problem is that it is hard to prove that c != b or i != j, that
is, that the load from c[j] cannot read the cell written through b[i].
Once information flows into memory, the safest thing to do is to flag
the whole memory as a SOURCE. Of course, that is very conservative.

I was wondering if you guys could recommend some strategies and
techniques to be more precise. In particular, if you could point me to
a paper that does this, that would be great.

My best regards,

Gabriel.
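
To make the trade-off in the example concrete, here is a minimal sketch in Python. It is not LLVM code and not the pass described above. It assumes a hypothetical straight-line mini-IR (the tuple opcodes and the `analyze` function are invented for illustration), it ignores array indices, and it assumes every pointer comes directly from a `malloc`. It also assumes `c` is a separate allocation, which the post does not state. With `per_allocation=False` all of memory is one abstract location, which is the conservative scheme from the post. With `per_allocation=True` each `malloc` site gets its own abstract location, one common way of gaining some precision.

```python
# Hypothetical mini-IR (names invented for this sketch, not LLVM's):
#   ("source", dst)      dst = SOURCE
#   ("malloc", dst)      dst = malloc(...)
#   ("store", ptr, val)  ptr[?] = val   (index ignored)
#   ("load", dst, ptr)   dst = ptr[?]   (index ignored)
#   ("sink", val)        SINK = val

def analyze(program, per_allocation):
    """Return the indices of sink instructions that may receive tainted data."""
    tainted_vars = set()   # variables that may hold tainted values
    points_to = {}         # pointer variable -> abstract memory location
    tainted_mem = set()    # abstract locations that may hold tainted data
    leaks = []
    for idx, ins in enumerate(program):
        op = ins[0]
        if op == "source":
            tainted_vars.add(ins[1])
        elif op == "malloc":
            # One location per allocation site, or a single location "MEM"
            # standing for the whole memory.
            points_to[ins[1]] = idx if per_allocation else "MEM"
        elif op == "store":
            _, ptr, val = ins
            if val in tainted_vars:
                # Weak update: taint is only ever added, never removed.
                tainted_mem.add(points_to[ptr])
        elif op == "load":
            _, dst, ptr = ins
            if points_to[ptr] in tainted_mem:
                tainted_vars.add(dst)
        elif op == "sink":
            if ins[1] in tainted_vars:
                leaks.append(idx)
    return leaks

# The example from the post, with c assumed to be its own allocation.
program = [
    ("source", "a"),      # a = SOURCE
    ("malloc", "b"),      # b = malloc(100)
    ("malloc", "c"),      # assumption: c is a distinct allocation
    ("store", "b", "a"),  # b[i] = a
    ("load", "t", "c"),   # t = c[j]
    ("sink", "t"),        # SINK = t
]

print(analyze(program, per_allocation=False))  # [5]: whole memory is tainted
print(analyze(program, per_allocation=True))   # []: c's site was never written
```

The per-site variant is still coarse: it cannot separate b[i] from b[j], and a real pass would also need a points-to analysis to handle pointer copies, pointers passed across calls, and pointers stored in memory.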