 

PHP Word Count with approximate result to Word Counter

Tags:

regex

php

I'm building a small web app to manage texts from external writers. The app works well overall, but I have one small problem with the word counter.

The writers are paid by the number of words in a text, and the text contains HTML tags. The problem is that the texts use German characters (Ä, Ö, Ü, ß).

First I strip the tags:

    $content = strip_tags($content);

then I replace newlines and tabs with single spaces:

    $replace   = array("\r\n", "\n", "\r", "\t");
    $content = str_replace($replace, ' ', $content);

and finally I try to get the number of words

Method 1:

    $characterMap = 'ÄÖÜäöü߀';
    $count = str_word_count($content, 0, $characterMap);

Method 2:

    $to_delete = array('.', ',', ';', "'", '@');
    $content = str_replace($to_delete, '', $content);

    $count = count(preg_split('~[^\p{L}\p{N}\']+~u',$content));

but the results differ from those of other tools, such as Word or the CKEditor word_count plugin.

For example, for one sample text:

Word and the CKEditor word count: 987 words

Method 1: 968 words

Method 2: 995 words

The problem with the second method is just the hyphen (-) separators within words. My question is: is there a better way to count the words in a text in PHP?
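To illustrate the hyphen issue, here is a minimal sketch. The sample sentence is made up for the demonstration and is not the real text; the second pattern, which adds the hyphen to the characters that stay inside a word and drops empty pieces, is only an assumption about what a fix could look like.

```php
<?php
// Hypothetical sample sentence (file must be saved as UTF-8).
$sample = 'Die E-Mail-Adresse ist schön.';

// Pattern from Method 2: every run of characters that are not letters,
// digits or apostrophes is a separator. The hyphens split the compound
// into three pieces and the trailing "." leaves an empty last element.
$parts = preg_split('~[^\p{L}\p{N}\']+~u', $sample);
echo count($parts), "\n"; // 7

// Assumed variant: keep "-" inside words and discard empty pieces.
$fixed = preg_split('~[^\p{L}\p{N}\'-]+~u', $sample, -1, PREG_SPLIT_NO_EMPTY);
echo count($fixed), "\n"; // 4
```

With the first pattern the compound "E-Mail-Adresse" counts as three words, which would push the total above what a whitespace-based counter reports.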

asked Jan 27 '26 21:01 by felipep


1 Answer

First, you could combine your two replace statements into one; the word count ignores double spaces. Second, I'm not sure what your regex is meant to do, but it looks rather strange.

You should be able to simply do this:

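    // Extra characters to treat as part of a word (same map as in Method 1)
    $characterMap = 'ÄÖÜäöü߀';
    $content = strip_tags($content);
    // Replace line breaks, tabs and punctuation with spaces in one pass
    $replace = array("\r\n", "\n", "\r", "\t", '.', ',', ';', "'", '@');
    $content = str_replace($replace, ' ', $content);
    $count = str_word_count($content, 0, $characterMap);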
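If the goal is to get close to what Word or an editor plugin reports, another option is to count whitespace-separated tokens, so hyphenated compounds stay in one piece. The sketch below is only one possible approach, not what either tool is known to do internally: the function name countWords is made up for illustration, and the rule that a token must contain at least one letter or digit is an assumption.

```php
<?php
// Hypothetical helper: counts whitespace-separated tokens that contain
// at least one Unicode letter or digit. Expects UTF-8 input.
function countWords(string $html): int
{
    // Remove the markup, then decode entities such as &auml; or &nbsp;.
    $text = strip_tags($html);
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Split on any run of whitespace or Unicode separators
    // (\p{Z} also covers the non-breaking space).
    $tokens = preg_split('~[\s\p{Z}]+~u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($tokens === false) {
        return 0; // e.g. the input was not valid UTF-8
    }

    // Ignore tokens made only of punctuation, such as a lone "-" or "&".
    $words = array_filter($tokens, function (string $token): bool {
        return preg_match('~[\p{L}\p{N}]~u', $token) === 1;
    });

    return count($words);
}

echo countWords('<p>Die E-Mail-Adresse ist schön.</p>'), "\n"; // 4
echo countWords('Äpfel &amp; Öl - 3 Stück'), "\n";             // 4
```

One caveat: strip_tags removes tags without inserting a space, so text from adjacent block elements such as `<p>a</p><p>b</p>` is joined into one token unless the markup has whitespace between the elements.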
answered Jan 29 '26 11:01 by brandonscript


