Path: csiph.com!xmission!news.snarked.org!border2.nntp.dca1.giganews.com!nntp.giganews.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end From: David Lovemore Newsgroups: comp.compilers Subject: Re: An added line to Tarjan's SCC algorithm in Muchnik's Advanced Compiler Design Date: Mon, 27 Aug 2018 04:51:49 -0700 (PDT) Organization: Compilers Central Lines: 131 Sender: news@iecc.com Approved: comp.compilers@iecc.com Message-ID: <18-08-010@comp.compilers> References: <18-08-001@comp.compilers> <18-08-003@comp.compilers> <18-08-008@comp.compilers> Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 8bit Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="67177"; mail-complaints-to="abuse@iecc.com" Keywords: books, analysis Posted-Date: 27 Aug 2018 09:21:32 EDT X-submission-address: compilers@iecc.com X-moderator-address: compilers-request@iecc.com X-FAQ-and-archives: http://compilers.iecc.com In-Reply-To: <18-08-008@comp.compilers> Xref: csiph.com comp.compilers:2110 On Friday, August 24, 2018 at 3:30:24 PM UTC+1, woon...@gmail.com wrote: > David Lovemore wrote: > > On Thursday, August 16, 2018 at 1:41:12 PM UTC+1, woon...@gmail.com wrote: > > > The official errata for the book, Advanced Compiler Design and Implementation > > > by Steven Muchnik added, among other fixes, this line to the Tarjan's algorithm > > > to find maximal strongly connected components (SCCs) from a directed graph. > > > > > > All_SCC U= {{Stack[1]}} > > > > > > where I replaced with brackets an arrow to mean retrival of an element from a > > > sequence named Stack. ... > [...] > > https://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algori thm > > Thanks for the reference. I compared and found no differences. > > If the line in question had come from the text instead of the errata, > I would not have posted this question. I still wonder why the author > decided to add the line that looks unnecessary. It is unnecessary. Thinking through why this is the case I found some simplifications to the code. Strong_Components only ever returns anything on the stack if it has found an edge to an earlier node on the stack. When a node is reached that can't reach an earlier node, that node and everything that is newer than it on the stack is removed from the stack and recorded as an SCC. Importantly in each call from the top level, lowLink[x] will never decrease as there can't be a reachable y on the stack with a lower Dfn[y]. So, at the end of the function all the nodes remaining on the stack will be popped as an SCC and added to All_SCC. My simplifications are: 1. It is unnecessary to label LowLink on all the nodes as it can be kept as a local variable and returned from Strong_Components. 2. Instead of recording Dfn(x), we can record Index(x) = index of x when placed on Stack. As x is never placed on Stack more than once this is unique to x. When we need to test for y being on Stack, we can check Stack(Index(y))==y. This means we have no need for a separate "OnStack" flag/hashtable. 3. The need for an extra check that Index(y) is still an index on the stack can be avoided. (So we omit "Index(y) < len(Stack)" check before checking "Stack(Index(y))==y" as it is implied by "Index(y) < lowestFound".) 2 and 3 assume that you have an equality test on nodes. Here is my Python code for reference: def TarjanSCC(nodes, Succ): Dfn = {x:0 for x in nodes} LowLink = dict() NextDfn = 0 Stack = [] All_SCC = [] def Strong_Components(x, Succ): nonlocal NextDfn nonlocal All_SCC NextDfn+=1 Dfn[x] = NextDfn LowLink[x] = Dfn[x] Stack.append(x) for y in Succ[x]: if Dfn[y] == 0: Strong_Components(y, Succ) LowLink[x] = min(LowLink[x], LowLink[y]) elif Dfn[y] < Dfn[x] and y in Stack: LowLink[x] = min(LowLink[x], Dfn[y]) if LowLink[x] == Dfn[x]: # All nodes on Stack are ascending in Dfn. # All nodes between when Dfn[x] was pushed and # top of stack are SCC. SCC = set() while Stack: z = Stack[-1] if Dfn[z] < Dfn[x]: All_SCC.append(SCC) return Stack.pop() SCC.add(z) All_SCC.append(SCC) for x in nodes: if Dfn[x] == 0: Strong_Components(x,Succ) return All_SCC def SimpleSCC(nodes, Succ): index = dict() # undefined if unseen else stack index when on stack stack = [] all_SCC = [] def lowestReachable(x, succ): """Returns lowest index on stack of reachable nodes from x. All found SCCs are added to all_SCC.""" nonlocal stack nonlocal all_SCC lowestFound = len(stack) index[x] = len(stack) stack.append(x) for y in succ[x]: i = index.get(y) if i is None: # y not in Index lowestFound = min(lowestFound, lowestReachable(y, succ)) elif i < lowestFound and stack[i]==y: lowestFound = i # y is node lower on stack. if lowestFound == index[x]: # No lower node found. # Nodes between lowestFound and top of stack are single SCC. all_SCC.append(set(stack[lowestFound:])) stack = stack[:lowestFound] # Trim stack to low elements. return lowestFound for x in nodes: if x not in index: lowestReachable(x,Succ) return all_SCC nodes = [1,2,3,4,5,6,7,8] Succ={1:[2,3], 2:[4,5], 3:[4], 4:[3,7,8], 5:[2], 6:[3], 7:[7], 8:[3,7]} print(nodes) print(Succ) print(TarjanSCC(nodes, Succ)) print(SimpleSCC(nodes, Succ))