- isub(+Text1:text, +Text2:text, -Similarity:float, +Options:list) is det
- Similarity is a measure of the similarity/dissimilarity between
Text1 and Text2. E.g.
    ?- isub('E56.Language', 'languange', D, [normalize(true)]).
    D = 0.4226950354609929.                       % [-1,1] range
    ?- isub('E56.Language', 'languange', D, [normalize(true),zero_to_one(true)]).
    D = 0.7113475177304964.                       % [0,1] range
    ?- isub('E56.Language', 'languange', D, []).  % without normalization
    D = 0.19047619047619047.                      % [-1,1] range
    ?- isub(aa, aa, D, []).                       % does not work for short substrings
    D = -0.8.
    ?- isub(aa, aa, D, [substring_threshold(0)]). % works with short substrings
    D = 1.0.                                      % but may give unwanted values
                                                  % between e.g. 'store' and 'spore'.
    ?- isub(joe, hoe, D, [substring_threshold(0)]).
    D = 0.5315315315315314.
    ?- isub(joe, hoe, D, []).
    D = -1.0.
This is a new version of isub/4 which replaces the old version while providing backwards compatibility. This new version allows several options to tweak the algorithm.
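As an illustration of how isub/4 might be used in practice, the following minimal sketch picks the candidate most similar to a query text. The helper best_match/4 is hypothetical and not part of library(isub); it only assumes the isub/4 options documented here.

```prolog
:- use_module(library(isub)).
:- use_module(library(lists)).

%!  best_match(+Query, +Candidates, -Best, -Score) is semidet.
%
%   Hypothetical helper (not part of library(isub)): Best is the
%   element of Candidates with the highest isub/4 similarity to
%   Query, and Score is that similarity in the [0,1] range.
%   Fails if Candidates is empty.
best_match(Query, Candidates, Best, Score) :-
    findall(S-C,
            ( member(C, Candidates),
              isub(Query, C, S, [normalize(true), zero_to_one(true)])
            ),
            Scored),
    % Pairs are ordered on the score first, so the maximum pair
    % carries the best-scoring candidate.
    max_member(Score-Best, Scored).
```

A call such as `best_match('E56.Language', [languange, spore, store], Best, Score)` binds Best to the highest-scoring candidate. The option list is written literally in the call so that it can be processed at compile time.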
- Arguments:
  - Text1 and Text2 are either an atom, a string, or a list of characters or character codes.
  - Similarity is a float in the range [-1,1], where 1.0 means most similar. The range can be set to [0,1] with the zero_to_one option described below.
  - Options is a list with the elements described below. Note that the options are processed at compile time using goal_expansion to provide much better speed. Supported options are:
- normalize(+Boolean)
- Applies string normalization as implemented by the original authors: Text1 and Text2 are mapped to lowercase and the characters "._ " are removed. Lowercase mapping is done with the C-library function towlower(). In general, the required normalization is domain dependent and is better left to the caller. See e.g. unaccent_atom/2. The default is to skip normalization (false).
- zero_to_one(+Boolean)
- The old isub implementation deviated from the original algorithm by returning a value in the [0,1] range. This new isub/4 implementation defaults to the original range of [-1,1], but this option can be set to true to change the output range to [0,1].
- substring_threshold(+Nonneg)
- The original algorithm was meant to compare terms in semantic web ontologies, and it had a hard-coded parameter that only considered common substrings longer than 2 characters. As a result, the similarity between, for example, 'aa' and 'aa' is -0.8, which is not expected. This option allows the user to set any threshold, such as 0, so that the similarity between short strings is properly recognized. The default value is 2, which is what the original algorithm used.
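The options above can be combined in small wrappers, sketched below. Both predicate names are hypothetical. The linear relation between the two output ranges, (S+1)/2, is an assumption inferred from the example values shown earlier (0.4226... and 0.7113...), not something the documentation states.

```prolog
:- use_module(library(isub)).

%!  short_text_similarity(+Text1, +Text2, -Similarity) is det.
%
%   Hypothetical wrapper for short inputs: lowers the substring
%   threshold to 0 so that short common substrings are counted.
short_text_similarity(Text1, Text2, Similarity) :-
    isub(Text1, Text2, Similarity, [substring_threshold(0)]).

%!  rescale_consistent(+Text1, +Text2) is semidet.
%
%   Hypothetical check: succeeds if the zero_to_one(true) result
%   equals (S+1)/2 for the default [-1,1] result S, within a small
%   tolerance. The linear mapping is an assumption inferred from
%   the documented example values.
rescale_consistent(Text1, Text2) :-
    isub(Text1, Text2, S, [normalize(true)]),
    isub(Text1, Text2, S01, [normalize(true), zero_to_one(true)]),
    abs((S + 1)/2 - S01) < 1.0e-9.
```

With the default threshold, `isub(aa, aa, D, [])` yields -0.8 as documented above, whereas `short_text_similarity(aa, aa, D)` yields 1.0. As the earlier example notes, a threshold of 0 may also raise the score of pairs such as 'store' and 'spore'.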