Groups > comp.lang.python > #86068 > unrolled thread
| Started by | subhabangalore@gmail.com |
|---|---|
| First post | 2015-02-21 12:51 -0800 |
| Last post | 2015-02-22 10:14 -0800 |
| Articles | 7 — 4 participants |
How to design a search engine in Python? subhabangalore@gmail.com - 2015-02-21 12:51 -0800
Re: How to design a search engine in Python? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-02-22 15:42 +1100
Re: How to design a search engine in Python? subhabangalore@gmail.com - 2015-02-21 21:02 -0800
Re: How to design a search engine in Python? Denis McMahon <denismfmcmahon@gmail.com> - 2015-02-22 05:37 +0000
Re: How to design a search engine in Python? subhabangalore@gmail.com - 2015-02-21 22:07 -0800
Re: How to design a search engine in Python? Laura Creighton <lac@openend.se> - 2015-02-22 10:12 +0100
Re: How to design a search engine in Python? subhabangalore@gmail.com - 2015-02-22 10:14 -0800
| From | subhabangalore@gmail.com |
|---|---|
| Date | 2015-02-21 12:51 -0800 |
| Subject | How to design a search engine in Python? |
| Message-ID | <9701e262-7c29-49b6-bb66-8138f484bbea@googlegroups.com> |
Dear Group,

I am trying to build a search engine in Python.

To do this, I have read tutorials and working methodologies from web and books like Stanford IR book [http://www-nlp.stanford.edu/IR-book/]. I know how to design a crawler, I know PostgreSQL, I am fluent with PageRank, TF-IDF, Zipf's law, etc. I came to know of Whoosh [https://pypi.python.org/pypi/Whoosh/].

But I am looking for a complete tutorial on how to implement one. If anybody could kindly direct me to one, I would be grateful. I have heard there are good source code examples and prototypes, but I have not been able to find them.

Apologies if this question is off-topic for the group. I tried posting here because this is a group of Python experts.

Regards,
Subhabrata.
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2015-02-22 15:42 +1100 |
| Message-ID | <54e95e34$0$13006$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #86068 |
subhabangalore@gmail.com wrote:

> Dear Group,
>
> I am trying to build a search engine in Python.

How to design a search engine in Python?

First, design a search engine.

Then, write Python code to implement that search engine.

> To do this, I have read tutorials and working methodologies from web and
> books like Stanford IR book [http://www-nlp.stanford.edu/IR-book/]. I
> know how to design a crawler, I know PostgreSQL, I am fluent with
> PageRank, TF-IDF, Zipf's law, etc. I came to know of
> Whoosh [https://pypi.python.org/pypi/Whoosh/]

How does your search engine work? What does it do?

You MUST be able to describe the workings of your search engine in English, or the natural language of your choice. Write out the steps that it must take, the tasks that it must perform. This is your algorithm. Without an algorithm, how do you expect to write code? What will the code do?

Once you have designed your search engine algorithm, then *and only then* should you start to write code to implement that algorithm.

-- 
Steven
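As one illustration of writing the steps down before coding, a very small text search engine could be described in three steps: (1) split each document into lowercase words, (2) record, for every word, which documents contain it, and (3) answer a query by returning the documents that contain every query word. The sketch below follows those three steps using only the standard library. The step list itself, the function names (`tokenize`, `build_index`, `search`) and the sample documents are assumptions made for illustration; nothing in the thread prescribes this particular design.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Step 1: split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Step 2: map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Step 3: return the ids of documents containing every query word."""
    words = tokenize(query)
    if not words:
        return set()
    # Use .get so that unknown words do not add empty entries to the index.
    postings = [index.get(word, set()) for word in words]
    return set.intersection(*postings)


# Hypothetical sample documents, for illustration only.
docs = {
    1: "Python is a programming language",
    2: "A search engine indexes documents",
    3: "Writing a search engine in Python",
}
index = build_index(docs)
print(sorted(search(index, "search engine")))
print(sorted(search(index, "python search")))
```

This is plain boolean (AND) retrieval with no ranking, so it only answers whether a document matches, not how well.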
| From | subhabangalore@gmail.com |
|---|---|
| Date | 2015-02-21 21:02 -0800 |
| Message-ID | <be1e90e1-ea64-4a93-ac7a-1bf6a823dd83@googlegroups.com> |
| In reply to | #86078 |
On Sunday, February 22, 2015 at 10:12:39 AM UTC+5:30, Steven D'Aprano wrote:

> subhabangalore@gmail.com wrote:
>
> > Dear Group,
> >
> > I am trying to build a search engine in Python.
>
> How to design a search engine in Python?
>
> First, design a search engine.
>
> Then, write Python code to implement that search engine.
>
> > To do this, I have read tutorials and working methodologies from web and
> > books like Stanford IR book [http://www-nlp.stanford.edu/IR-book/]. I
> > know how to design a crawler, I know PostgreSQL, I am fluent with
> > PageRank, TF-IDF, Zipf's law, etc. I came to know of
> > Whoosh [https://pypi.python.org/pypi/Whoosh/]
>
> How does your search engine work? What does it do?
>
> You MUST be able to describe the workings of your search engine in English,
> or the natural language of your choice. Write out the steps that it must
> take, the tasks that it must perform. This is your algorithm. Without an
> algorithm, how do you expect to write code? What will the code do?
>
> Once you have designed your search engine algorithm, then *and only then*
> should you start to write code to implement that algorithm.
>
> --
> Steven

Dear Sir,

Thank you for your suggestion. But I was looking for a small tutorial of algorithm of the whole engine. I would try to check it build individual modules and integrate them. I was getting some in google and youtube, but I tried to consult you as I do not know whether they would be fine. I am trying your way, let me see how much I go. There are so many search algorithms in our popular data structure books, that is not an issue but how a search engine is getting done, I am thinking bit on that.

Regards,
Subhabrata.
| From | Denis McMahon <denismfmcmahon@gmail.com> |
|---|---|
| Date | 2015-02-22 05:37 +0000 |
| Message-ID | <mcbpvf$rp4$3@dont-email.me> |
| In reply to | #86079 |
On Sat, 21 Feb 2015 21:02:34 -0800, subhabangalore wrote:

> Thank you for your suggestion. But I was looking for a small tutorial of
> algorithm of the whole engine. I would try to check it build individual
> modules and integrate them. I was getting some in google and youtube,
> but I tried to consult you as I do not know whether they would be fine.
> I am trying your way, let me see how much I go. There are so many search
> algorithms in our popular data structure books, that is not an issue but
> how a search engine is getting done, I am thinking bit on that.

Presumably a search engine is simply a database of keyword -> result, possibly with some scoring factor.

Calculating scoring factor is going to be fun.

Then of course result pages might have scoring factors too. What about a search with multiple keywords? Some result pages might match more than one keyword, so you might add their score for each keyword together to get the ranking in that enquiry for that page.

But then pages with lots and lots of different keywords might be low scoring, because searchers are looking for content, not pages of keywords.

Finally, what special, unique feature is your search engine going to have that makes it better than all the existing ones?

-- 
Denis McMahon, denismfmcmahon@gmail.com
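The "keyword -> result with a scoring factor" idea above can be sketched as an inverted index whose entries carry per-page counts, with a query's score for a page being the sum of its per-keyword scores. The sketch below is one possible reading, not Denis's design: it assumes a TF-IDF style score, divides each count by the page length so that pages stuffed with many different words score lower per keyword, and uses a smoothed IDF of `log(1 + N/df)`, which is just one common variant. The names `build_index` and `rank` and the sample pages are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Return (index, doc_lengths); index maps keyword -> {doc_id: count}."""
    index = defaultdict(dict)
    doc_lengths = {}
    for doc_id, text in docs.items():
        words = tokenize(text)
        doc_lengths[doc_id] = len(words)
        for word, count in Counter(words).items():
            index[word][doc_id] = count
    return index, doc_lengths


def rank(index, doc_lengths, query):
    """Score each page as the sum of its per-keyword TF-IDF scores."""
    n_docs = len(doc_lengths)
    scores = defaultdict(float)
    for word in set(tokenize(query)):
        postings = index.get(word, {})
        if not postings:
            continue  # keyword not found on any page
        # Assumed smoothing: keeps the weight positive even when every
        # page contains the keyword.
        idf = math.log(1 + n_docs / len(postings))
        for doc_id, count in postings.items():
            # Dividing by page length penalises pages full of other words.
            tf = count / doc_lengths[doc_id]
            scores[doc_id] += tf * idf
    # Highest score first; ties broken by document id for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample pages, for illustration only.
docs = {
    "a": "python search engine tutorial",
    "b": "python language reference guide",
    "c": "search engine ranking with python and many many other unrelated keywords here",
}
index, doc_lengths = build_index(docs)
for doc_id, score in rank(index, doc_lengths, "python search"):
    print(doc_id, round(score, 3))
```

With these sample pages, page "c" matches both query keywords but is diluted by its many other words, so it ranks below the short page "a" and even below "b", which matches only one keyword. Whether that is the desired behaviour is exactly the kind of design decision the scoring factor has to settle.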
| From | subhabangalore@gmail.com |
|---|---|
| Date | 2015-02-21 22:07 -0800 |
| Message-ID | <0914bd64-6a5b-4014-af6e-f41dcb0c4fba@googlegroups.com> |
| In reply to | #86080 |
On Sunday, February 22, 2015 at 11:08:47 AM UTC+5:30, Denis McMahon wrote:

> On Sat, 21 Feb 2015 21:02:34 -0800, subhabangalore wrote:
>
> > Thank you for your suggestion. But I was looking for a small tutorial of
> > algorithm of the whole engine. I would try to check it build individual
> > modules and integrate them. I was getting some in google and youtube,
> > but I tried to consult you as I do not know whether they would be fine.
> > I am trying your way, let me see how much I go. There are so many search
> > algorithms in our popular data structure books, that is not an issue but
> > how a search engine is getting done, I am thinking bit on that.
>
> Presumably a search engine is simply a database of keyword -> result,
> possibly with some scoring factor.
>
> Calculating scoring factor is going to be fun.
>
> Then of course result pages might have scoring factors too. What about a
> search with multiple keywords? Some result pages might match more than
> one keyword, so you might add their score for each keyword together to
> get the ranking in that enquiry for that page.
>
> But then pages with lots and lots of different keywords might be low
> scoring, because searchers are looking for content, not pages of keywords.
>
> Finally, what special, unique feature is your search engine going to have
> that makes it better than all the existing ones?
>
> --
> Denis McMahon,

Dear Sir,

Thank you for your kind suggestion. Let me traverse one by one. My special feature is generally Semantic Search, but I am trying to build a search engine first and then go for semantic. I feel that would give me a solid background to work around the problem.

Regards,
Subhabrata.
| From | Laura Creighton <lac@openend.se> |
|---|---|
| Date | 2015-02-22 10:12 +0100 |
| Message-ID | <mailman.18991.1424596357.18130.python-list@python.org> |
| In reply to | #86081 |
In a message of Sat, 21 Feb 2015 22:07:30 -0800, subhabangalore@gmail.com writes:

>Dear Sir,
>
>Thank you for your kind suggestion. Let me traverse one by one.
>My special feature is generally Semantic Search, but I am trying to build
>a search engine first and then go for semantic. I feel that would give me a solid background to work around the problem.
>
>Regards,
>Subhabrata.

You may find the API docs surrounding rdelbru.github.io/SIREn/ of interest then.

Laura Creighton
| From | subhabangalore@gmail.com |
|---|---|
| Date | 2015-02-22 10:14 -0800 |
| Message-ID | <b63cc7b5-b72c-4b95-bc7f-320a8bfca2a0@googlegroups.com> |
| In reply to | #86090 |
On Sunday, February 22, 2015 at 2:42:48 PM UTC+5:30, Laura Creighton wrote:

> In a message of Sat, 21 Feb 2015 22:07:30 -0800, subhabangalore@gmail.com writes:
>
> >Dear Sir,
> >
> >Thank you for your kind suggestion. Let me traverse one by one.
> >My special feature is generally Semantic Search, but I am trying to build
> >a search engine first and then go for semantic. I feel that would give me a solid background to work around the problem.
> >
> >Regards,
> >Subhabrata.
>
> You may find the API docs surrounding rdelbru.github.io/SIREn/
> of interest then.
>
> Laura Creighton

Dear Madam,

Thank you for your kind help. I will surely check it.

Regards,
Subhabrata.