Funkschy/TinySegmenter

TinySegmenter

TinySegmenter is a Clojure library which splits Japanese text into words. This is needed because the Japanese stubbornly refuse to use the spacebar.

This library is a Clojure port of the TinySegmenter JavaScript library by Taku Kudo (taku@chasen.org). This version is based on the Python 3 version of TinySegmenter.

Usage

The library exports a single function, segment, which takes a string (or any char sequence) as its argument and returns the words it was split into.

(require '[tinysegmenter.core :refer [segment]])

(= (segment "私の名前はFelixです")
   ["私" "の" "名前" "は" "Felix" "です"])
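A common follow-up is to turn the tokens back into a single space-separated string, for example before handing the text to a tool that expects whitespace-delimited words. The sketch below is only an illustration: `spaced` is a hypothetical helper that is not part of this library, and the token vector is written out by hand as a stand-in for the result of `segment`, so the snippet runs without the library on the classpath.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper, not part of tinysegmenter:
;; joins a collection of tokens with single spaces.
(defn spaced
  [tokens]
  (str/join " " tokens))

;; Hand-written tokens standing in for the output of `segment`.
;; With the library loaded, one would presumably call
;; (spaced (segment "私の名前はFelixです")) instead.
(spaced ["私" "の" "名前" "は" "Felix" "です"])
;; => "私 の 名前 は Felix です"
```

Because `str/join` accepts any collection, the helper should work whether `segment` returns a vector or a lazy sequence.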

Installation

TinySegmenter is available on Clojars.

Leiningen

[com.github.funkschy/tinysegmenter "0.1.0"]

Clojure CLI/deps.edn

com.github.funkschy/tinysegmenter {:mvn/version "0.1.0"}

License

This project is distributed under the BSD 3-Clause License, just like the original version by Taku Kudo. See the LICENSE file for more information.
