
How to detect programming language from a string [closed]

I am looking for a way to test a particular string to determine if it contains code.

For instance, I would like to pass a string such as "body{font-weight: bold;}" and determine that it is CSS.

I would like to do it for:

HTML, CSS, JavaScript, Ruby, C, C++, C#

I am guessing that it would be regex of some sort, but I am pretty stumped!

tribe84 asked Jan 31 '26 22:01



1 Answer

You need some kind of classifier that uses a heuristic or statistical approach. Accuracy improves as the input string gets longer (e.g. it's hard to say which language a lone "=" belongs to).
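To make the heuristic route concrete, here is a minimal sketch in Python. The patterns in `PATTERNS` and the function name `guess_language` are my own illustrative picks, not taken from any library, and a real detector would need far more rules (and would still confuse C, C++ and C# on short inputs).

```python
import re

# Hypothetical, hand-picked patterns per language (an assumption for this sketch).
PATTERNS = {
    "CSS": [
        r"[\w.#-]+\s*\{[^{}]*:[^{}]*;?\s*\}",             # selector { prop: value; }
        r"\b(?:font|margin|padding|color)[\w-]*\s*:",     # common property names
    ],
    "HTML": [
        r"</?[a-zA-Z][\w-]*[^<>]*>",                      # opening or closing tag
        r"<!DOCTYPE\s+html",
    ],
    "JavaScript": [
        r"\bfunction\s*\w*\s*\(",
        r"\b(?:var|let|const)\s+\w+\s*=",
        r"=>",
    ],
    "Ruby": [r"\bdef\s+\w+", r"\bend\b", r"\bputs\b"],
    "C": [r"#include\s*<\w+\.h>", r"\bprintf\s*\("],
}


def guess_language(snippet):
    """Return the language whose patterns match most often, or None if nothing matches."""
    scores = {
        lang: sum(len(re.findall(pattern, snippet)) for pattern in patterns)
        for lang, patterns in PATTERNS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(guess_language("body{font-weight: bold;}"))  # CSS
print(guess_language("="))                         # None: too little evidence
```

The second call shows the point about short inputs: a bare "=" triggers no pattern, so the sketch declines to guess instead of picking a language at random.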

Here's an example of a classifier that uses Bayesian methods - http://www.rubyinside.com/sourceclassifier-identifying-programming-languages-quickly-1431.html
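For a rough idea of what a Bayesian classifier does, here is a small naive Bayes sketch in Python. It is not the code from the linked project. The class name `NaiveBayes`, the tokenizer, and the tiny training snippets are all assumptions made up for illustration, and a usable classifier would be trained on many real source files per language.

```python
import math
import re
from collections import Counter, defaultdict

# Identifiers become one token each; every punctuation character is its own token.
TOKEN = re.compile(r"[A-Za-z_]\w*|[^\sA-Za-z_0-9]")


def tokenize(text):
    return TOKEN.findall(text)


class NaiveBayes:
    """Multinomial naive Bayes over tokens with add-one (Laplace) smoothing."""

    def __init__(self):
        self.token_counts = defaultdict(Counter)  # label -> token frequencies
        self.doc_counts = Counter()               # label -> number of training samples
        self.vocab = set()

    def train(self, label, text):
        tokens = tokenize(text)
        self.token_counts[label].update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def classify(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.token_counts.items():
            # log P(label) + sum of log P(token | label)
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training data, far too small for real use.
clf = NaiveBayes()
clf.train("CSS", "body { font-weight: bold; } .nav a { color: red; margin: 0; }")
clf.train("Ruby", 'def greet(name)\n  puts "Hello #{name}"\nend\nclass Foo\n  attr_reader :bar\nend')
clf.train("JavaScript", "function greet(name) { var msg = 'Hello ' + name; console.log(msg); return msg; }")
clf.train("C", '#include <stdio.h>\nint main(void) { printf("hi\\n"); return 0; }')

print(clf.classify("body{font-weight: bold;}"))  # CSS
```

Each extra token adds another term to the score, so longer inputs give the classifier more evidence to separate languages that share syntax.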

The highlight.js library also does language detection, and it is written in JavaScript. Take a look at the source.

Noufal Ibrahim answered Feb 02 '26 11:02
