From: Patricia Shanahan
Date: Sun, 15 Apr 2012 09:17:07 -0700
Newsgroups: comp.lang.java.programmer
Subject: Re: Pattern suggestion

On 4/15/2012 7:11 AM, FrenKy wrote:
> Hi *,
> I have a huge file (~10GB) which I'm reading line by line. Each line has
> to be analyzed by many number of different analyzers. The problem I have
> is that to make it at least a bit performance optimized due to sometimes
> time consuming processing (usually because of delays due to external
> interfaces) i would need to make it heavily multithreaded.
> File should be read only once to reduce IO on disks.
>
> So I need "1 driver to many workers" pattern where workers are
> multithreaded.
>
> I have a solution now based on Observable/Observer that I use (and it
> works) but I'm not sure if it is the best way.

I suggest taking a look at java.util.concurrent.ThreadPoolExecutor and
related classes. Try to minimize ordering relationships between
processing on the lines, so that you can overlap work on multiple lines
as much as possible.

Patricia
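
To illustrate the ThreadPoolExecutor suggestion, here is a minimal sketch, not a definitive implementation. It assumes each analyzer can be modeled as a Consumer<String> that is safe to call from several threads and does not depend on line order. The names LineFanOut and process, the queue size, and the timeout are hypothetical choices for illustration only. One thread reads each line once and submits one task per (line, analyzer) pair. The bounded queue combined with CallerRunsPolicy is one way to keep a very large file from flooding memory: when the queue is full, the reading thread runs the task itself, which slows reading down.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical helper class; the original poster's analyzer interface is not known.
class LineFanOut {

    // Reads every line exactly once and hands it to each analyzer on a worker thread.
    // Analyzers are assumed to be thread-safe and independent of line order.
    static void process(BufferedReader reader,
                        List<Consumer<String>> analyzers,
                        int threads)
            throws IOException, InterruptedException {
        // Bounded queue (size is an arbitrary assumption) so pending tasks cannot
        // grow without limit; CallerRunsPolicy makes the reader run the task itself
        // when the queue is full, which throttles reading.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(1000),
                new ThreadPoolExecutor.CallerRunsPolicy());
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                final String current = line;
                for (Consumer<String> analyzer : analyzers) {
                    pool.execute(() -> analyzer.accept(current));
                }
            }
        } finally {
            // Stop accepting new tasks; tasks already queued still run.
            pool.shutdown();
        }
        // Wait for the remaining tasks; the one-hour limit is an arbitrary assumption.
        pool.awaitTermination(1, TimeUnit.HOURS);
    }
}
```

A caller would wrap the file in a BufferedReader and pass a list of analyzers together with a thread count. If most of the time goes to waiting on external interfaces, a thread count well above the number of cores may be reasonable, but that is something to measure rather than assume.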