
Specify Character Encoding for Net::HTTP

Tags: ruby, ruby-2.0

When I make this HTTP request:

Net::HTTP.get_response('www.telize.com',"/geoip/190.88.39.27").body
  => "{\"timezone\":\"America\\/Curacao\",\"isp\":\"United Telecommunication Services (UTS)\",\"country\":\"Cura\xE7ao\",\"dma_code\":\"0\",\"region_code\":\"00\",\"area_code\":\"0\",\"ip\":\"190.88.39.27\",\"asn\":\"AS11081\",\"continent_code\":\"NA\",\"city\":\"Willemstad\",\"longitude\":-68.9167,\"latitude\":12.1,\"country_code\":\"CW\",\"country_code3\":\"CUW\"}\n"

It returns a JSON body, but notice the country: \"country\":\"Cura\xE7ao\". The response body should actually look like this: "country":"Curaçao". It looks like Net::HTTP is assuming this is ASCII-8BIT:

Net::HTTP.get_response('www.telize.com',"/geoip/190.88.39.27").body.encoding
 => Encoding:ASCII-8BIT

But this can't be the case. How can I tell Net::HTTP which character encoding to use when making the request?

Tom Rossi asked May 08 '26 17:05


1 Answer

As the Tin Man determined, "\xE7" is the Latin-1 (ISO-8859-1) encoding of LATIN SMALL LETTER C WITH CEDILLA (ç), which, as far as I can determine, isn't a valid JSON encoding, since JSON text is expected to be Unicode (normally UTF-8).
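One way to check that reading of the byte offline is sketched below. It makes no network request; the single byte is typed in by hand as a stand-in for the one in the response body.

```ruby
byte = "\xE7".b   # the single byte from the response body, tagged as binary

# Interpreted as UTF-8, a lone 0xE7 is an incomplete multi-byte sequence.
as_utf8 = byte.dup.force_encoding("UTF-8")
puts as_utf8.valid_encoding?      # prints false

# Interpreted as ISO-8859-1 (Latin-1), it is a complete character.
as_latin1 = byte.dup.force_encoding("ISO-8859-1")
puts as_latin1.valid_encoding?    # prints true

# Transcoded to UTF-8, that character is code point U+00E7.
puts format("U+%04X", as_latin1.encode("UTF-8").ord)   # prints U+00E7
```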

But once you know the encoding, you can change it from Ruby's ASCII-8BIT (which just means Ruby considers the data to be binary, i.e. unencoded) to UTF-8, like this:

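require 'net/http'

# Encoding the server actually used (worked out by inspecting the bytes)
server_encoding = "ISO-8859-1"
resp = Net::HTTP.get_response('www.telize.com', "/geoip/190.88.39.27")
# Relabel the binary body as ISO-8859-1, then transcode it to UTF-8
json = resp.body.force_encoding(server_encoding).encode("UTF-8")
puts json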

--output:--

{"timezone":"America\/Curacao","isp":"United Telecommunication Services
(UTS)","country":"Curaçao","dma_code":"0","region_code":"00","area_code":"0",
"ip":"190.88.39.27","asn":"AS11081","continent_code":"NA","city":"Willemstad",
"longitude":-68.9167,"latitude":12.1,"country_code":"CW","country_code3":"CUW"}
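The snippet above needs both calls because they do different jobs: force_encoding only relabels the existing bytes, while encode rewrites them. Here is a small offline sketch of that difference, with a hand-typed byte string standing in for the response body (an assumption for illustration; no request is made):

```ruby
# The country name as it arrives from Net::HTTP: binary-tagged bytes.
raw = "Cura\xE7ao".b

# force_encoding changes only the label; the bytes stay the same.
latin1 = raw.dup.force_encoding("ISO-8859-1")
puts latin1.bytes == raw.bytes    # prints true

# encode transcodes: the single byte E7 becomes the two bytes C3 A7 in UTF-8.
utf8 = latin1.encode("UTF-8")
puts raw.bytesize                 # prints 7
puts utf8.bytesize                # prints 8

# Passing the source encoding to encode does both steps in one call.
one_step = raw.encode("UTF-8", "ISO-8859-1")
puts one_step == utf8             # prints true
```

Calling force_encoding("UTF-8") directly on the raw body would leave the E7 byte in place and produce a string that is not valid UTF-8.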

Regarding "It looks like Net::HTTP is assuming this is ASCII-8BIT":

Net::HTTP tags the data as binary (ASCII-8BIT), i.e. the data has no encoding attached, and leaves it to you to figure out how to interpret the bytes.
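Since the interpretation is left to you, one option is to read the charset the server declares in its Content-Type header rather than hard-coding it. The following is only a minimal sketch: decode_body is a hypothetical helper, not part of Net::HTTP, and the ISO-8859-1 fallback is an assumption chosen to match this particular server. It only helps when the server declares its charset correctly.

```ruby
# Hypothetical helper (not part of Net::HTTP): return the body as UTF-8.
# content_type is the raw header value, e.g. resp['content-type'].
# fallback is an assumed default for when no usable charset is declared.
def decode_body(body, content_type, fallback = "ISO-8859-1")
  # Pull the charset parameter out of e.g. "application/json; charset=utf-8".
  charset = content_type.to_s[/charset\s*=\s*"?([^\s;"]+)/i, 1]
  source = begin
    Encoding.find(charset || fallback)
  rescue ArgumentError
    Encoding.find(fallback)   # charset name Ruby does not know: use the default
  end
  # Relabel a copy of the bytes, then transcode to UTF-8.
  body.dup.force_encoding(source).encode("UTF-8")
end

body = "Cura\xE7ao".b   # stand-in for resp.body
puts decode_body(body, "application/json; charset=ISO-8859-1")
puts decode_body(body, "application/json")   # no charset: fallback applies
```

With a real response this would be called as decode_body(resp.body, resp['content-type']). Note that encode can still raise if the declared charset does not match the actual bytes, so the declared value is a hint rather than a guarantee.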

7stud answered May 10 '26 09:05


