Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

To sort the letters in a string alphabetically in PostgreSQL

I am currently using this method to sort letters in a string alphabetically in PostgreSQL. Are there any other efficient methods?

select string_agg(c, '') as s
from   (select unnest(regexp_split_to_array('ijsAafhareDbv', '')) as c 
        order  by c) as t; 

       s   
 --------------
 ADaabefhijrsv
like image 609
Kaushik Nayak Avatar asked Nov 29 '25 11:11

Kaushik Nayak


2 Answers

I created 3 functions, one using my query, another using Laurenz's query and one more: I created a Python(plpythonu) function for sorting. Finally, I created a table with 100000 rows( I did it from my Mac Laptop for now ) with each containing a random 15 character string generated using the random_string function in this Link

create table t as select random_string(15) as s FROM generate_series(1,100000);

Here are the 3 functions.

CREATE or REPLACE FUNCTION sort1(x TEXT) RETURNS TEXT AS $$
select string_agg(c, '') as s
from   (select unnest(regexp_split_to_array($1, '')) as c 
        order  by c) as t;
$$ LANGUAGE SQL IMMUTABLE;


CREATE or REPLACE FUNCTION sort2(x TEXT) RETURNS TEXT AS $$
WITH t(s) AS (VALUES ($1))
SELECT string_agg(substr(t.s, g.g, 1), ''
                  ORDER BY substr(t.s, g.g, 1)
                 )
FROM t
   CROSS JOIN LATERAL generate_series(1, length(t.s)) g;

$$ LANGUAGE SQL IMMUTABLE;


create language plpythonu;
CREATE or REPLACE FUNCTION pysort(x text)
  RETURNS text
AS $$
  return ''.join(sorted(x))
$$ LANGUAGE plpythonu IMMUTABLE;

These are the results from EXPLAIN ANALYSE for all three.

knayak=# EXPLAIN ANALYSE select sort1(s)  FROM t;
                                                 QUERY PLAN                                                  
-------------------------------------------------------------------------------------------------------------
 Seq Scan on t  (cost=0.00..26541.00 rows=100000 width=32) (actual time=0.266..7097.740 rows=100000 loops=1)
 Planning time: 0.119 ms
 Execution time: 7106.871 ms
(3 rows)

knayak=# EXPLAIN ANALYSE select sort2(s)  FROM t;
                                                 QUERY PLAN                                                  
-------------------------------------------------------------------------------------------------------------
 Seq Scan on t  (cost=0.00..26541.00 rows=100000 width=32) (actual time=0.418..7012.935 rows=100000 loops=1)
 Planning time: 0.270 ms
 Execution time: 7021.587 ms
(3 rows)

knayak=# EXPLAIN ANALYSE select pysort(s) FROM t;
                                                 QUERY PLAN                                                 
------------------------------------------------------------------------------------------------------------
 Seq Scan on t  (cost=0.00..26541.00 rows=100000 width=32) (actual time=0.060..389.729 rows=100000 loops=1)
 Planning time: 0.048 ms
 Execution time: 395.760 ms
(3 rows)

From this analysis, it turns out - Python sort was the fastest and no considerable differences between the first 2. Need to check the performance in real-time for huge tables in our systems though.

like image 123
Kaushik Nayak Avatar answered Dec 02 '25 03:12

Kaushik Nayak


If you want a solution without regular expressions, you could use this:

WITH t(s) AS (VALUES ('amfjwzeils'))
SELECT string_agg(substr(t.s, g.g, 1), ''
                  ORDER BY substr(t.s, g.g, 1)
                 )
FROM t
   CROSS JOIN LATERAL generate_series(1, length(t.s)) g;

 string_agg 
------------
 aefijlmswz
(1 row)

I would benchmark which solution is faster.

like image 41
Laurenz Albe Avatar answered Dec 02 '25 03:12

Laurenz Albe



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!