How to weight terms based on semantic importance

From marc nicole <mk1853387@gmail.com>
Newsgroups comp.lang.python
Subject How to weight terms based on semantic importance
Date Wed, 15 Jan 2025 18:40:43 +0100
Message-ID <mailman.80.1736963341.2912.python-list@python.org>
References <CAGJtH9TYE-MEqSUHWO-JW5j-d2CtUqet7A_R2fn7A25iScGpFg@mail.gmail.com>

Hello,

I want to weight the terms of a large text by their semantic importance,
not by their frequency (as TF-IDF does).
Is there a way to do that using NLTK or other means, for example through a
vectorizer?

For example, a certain term should weigh more than the others.

Thanks
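
One possible reading of the request above, as a minimal sketch in plain Python: each term's count is multiplied by an importance weight supplied from outside. The names SEMANTIC_WEIGHTS, DEFAULT_WEIGHT and weighted_terms are hypothetical and do not come from NLTK or any other library, and the weight values are made up for illustration. The sketch does not derive semantic importance itself; it only shows how such weights, once obtained from some other source, could be applied to the terms of a text.

```python
import re
from collections import Counter

# Hypothetical importance weights, invented for this illustration.
# In practice they might come from a domain lexicon or another source.
SEMANTIC_WEIGHTS = {"python": 3.0, "vectorizer": 2.0}
DEFAULT_WEIGHT = 1.0  # weight for any term not listed above


def weighted_terms(text, weights=SEMANTIC_WEIGHTS, default=DEFAULT_WEIGHT):
    """Return {term: count * weight} for the lowercase word tokens in text."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(tokens)
    return {term: n * weights.get(term, default) for term, n in counts.items()}


scores = weighted_terms("Python vectorizer for Python text")
for term, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(term, score)
```

Terms missing from the weight table fall back to the default weight, so the result reduces to plain term counts when no weights are supplied.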
