
PHP foreach loop returns an extra unwanted array (Wikipedia API)

I've been researching this all day and haven't found a solution. I'm also very new to PHP.

The purpose of my function is to take user input (Category1) of a Wikipedia article and return its categories. The basic function below does this without any problems.

function get_all_categories() {
    $url = $this->get_url('categories');
    $url .= 'titles=' . urlencode($_POST['Category1']);
    $url .= '&cllimit=500';
    $data = $this->get_result($url);

    $array = json_decode($data, true);
}

Example result for Urban planning:

Array
(
[batchcomplete] => 
[query] => Array
    (
        [pages] => Array
            (
                [46212943] => Array
                    (
                        [pageid] => 46212943
                        [ns] => 0
                        [title] => Urban planning
                        [categories] => Array
                            (
                                [0] => Array
                                    (
                                        [ns] => 14
                                        [title] => Category:All Wikipedia articles written in American English
                                    )

                                [1] => Array
                                    (
                                        [ns] => 14
                                        [title] => Category:Commons category with local link same as on Wikidata
                                    )

                                [2] => Array
                                    (
                                        [ns] => 14
                                        [title] => Category:Pages using ISBN magic links
                                    )

                                [3] => Array
                                    (
                                        [ns] => 14
                                        [title] => Category:Urban planning
                                    )

                                [4] => Array
                                    (
                                        [ns] => 14
                                        [title] => Category:Use American English from April 2015
                                    )

                                [5] => Array
                                    (
                                        [ns] => 14
                                        [title] => Category:Use dmy dates from April 2015
                                    )

                                [6] => Array
                                    (
                                        [ns] => 14
                                        [title] => Category:Wikipedia articles needing clarification from June 2015
                                    )

                                [7] => Array
                                    (
                                        [ns] => 14
                                        [title] => Category:Wikipedia articles with GND identifiers
                                    )

                            )

                    )

            )

    )

)

My problem begins when I try to extract only the title values from this array. I've attempted to do this with a foreach loop over a RecursiveIteratorIterator, which is the easiest solution I found for multidimensional arrays:

$array1 = new RecursiveIteratorIterator(
    new RecursiveArrayIterator($array),
    RecursiveIteratorIterator::SELF_FIRST
);

foreach ($array1 as $key => $value) {
    if (is_array($value) && $key == 'categories') {
        $result = array_map(function($element){return $element['title'];}, $value);

        print_r($result);
    }
}

What I get with this code is two arrays: one with only the titles (what I wanted), and an unwanted array (which sometimes includes the first title) attached to the end:

Array
(
[0] => Category:All Wikipedia articles written in American English
[1] => Category:Commons category with local link same as on Wikidata
[2] => Category:Pages using ISBN magic links
[3] => Category:Urban planning
[4] => Category:Use American English from April 2015
[5] => Category:Use dmy dates from April 2015
[6] => Category:Wikipedia articles needing clarification from June 2015
[7] => Category:Wikipedia articles with GND identifiers
)
Array
(
[ns] => 
[title] => C
)

This extra array is what I don't understand. I think the problem is caused by the foreach loop. I tried unsetting $variable outside of the loop but it didn't help. The extra array becomes especially troublesome if I try to pass these results to another function. How can I prevent this from happening?

Sabaghian asked Jun 03 '26 05:06


1 Answer

For simplicity, you can traverse the array manually rather than using RecursiveIteratorIterator.

RecursiveIteratorIterator visits every node of the nested structure, which adds avoidable overhead on large arrays.

Change your extraction logic to this:

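$result = array();
// 'query' sits at the top level of the decoded response, next to 'batchcomplete'.
foreach ($array['query']['pages'] as $pageId => $page)
{
    // Each category entry has the form ['ns' => 14, 'title' => 'Category:...'].
    foreach ($page['categories'] as $cat)
    {
        $result[] = $cat['title'];
    }
}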
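
A slightly more defensive variant of the manual traversal, offered as a sketch rather than a definitive implementation. It assumes the decoded response has the query, pages, categories nesting shown in the question; the function name extract_category_titles and the sample data are made up for illustration. It uses array_column to pull the title values and skips pages that come back without a categories key (for example, a title that does not exist).

```php
<?php
// Hypothetical helper; $response is the array produced by json_decode($data, true).
function extract_category_titles(array $response): array
{
    $titles = [];
    // "?? []" keeps the loop safe when 'query' or 'pages' is absent.
    foreach ($response['query']['pages'] ?? [] as $page) {
        // array_column collects the 'title' value from each category entry.
        $titles = array_merge($titles, array_column($page['categories'] ?? [], 'title'));
    }
    return $titles;
}

// Small hand-written sample in the same shape as the output in the question.
$sample = [
    'batchcomplete' => '',
    'query' => ['pages' => [
        46212943 => [
            'pageid' => 46212943,
            'ns' => 0,
            'title' => 'Urban planning',
            'categories' => [
                ['ns' => 14, 'title' => 'Category:Urban planning'],
                ['ns' => 14, 'title' => 'Category:Use dmy dates from April 2015'],
            ],
        ],
        // A page entry with no 'categories' key is simply skipped.
        -1 => ['ns' => 0, 'title' => 'No such page', 'missing' => ''],
    ]],
];

print_r(extract_category_titles($sample));
```

Returning a flat list from one function also makes it easier to hand the titles to another function.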

Working Demo
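
As for why the second array shows up at all: it looks like a loose-comparison effect rather than a problem with foreach itself. On PHP versions before 8.0, 0 == 'categories' evaluates to true, because the string is converted to a number before comparing. With SELF_FIRST the iterator also visits the first category entry, whose key is the integer 0, so the original condition matches it and array_map runs over its ns and title values. That is consistent with the empty ns and the single letter C in the output. The sketch below keeps the iterator from the question but compares with ===; the helper name collect_titles and the shortened sample array are assumptions made for illustration.

```php
<?php
// Hand-written stand-in for the decoded API response (shape copied from the question).
$array = [
    'batchcomplete' => '',
    'query' => [
        'pages' => [
            46212943 => [
                'pageid' => 46212943,
                'ns' => 0,
                'title' => 'Urban planning',
                'categories' => [
                    ['ns' => 14, 'title' => 'Category:Pages using ISBN magic links'],
                    ['ns' => 14, 'title' => 'Category:Urban planning'],
                ],
            ],
        ],
    ],
];

// Hypothetical helper: same iterator as in the question, but with a strict key check.
function collect_titles(array $data): array
{
    $titles = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveArrayIterator($data),
        RecursiveIteratorIterator::SELF_FIRST
    );

    foreach ($iterator as $key => $value) {
        // === never treats the integer key 0 as equal to the string 'categories',
        // so only the real 'categories' list is processed.
        if (is_array($value) && $key === 'categories') {
            foreach ($value as $category) {
                $titles[] = $category['title'];
            }
        }
    }

    return $titles;
}

print_r(collect_titles($array));
```

The strict check behaves the same on PHP 7 and PHP 8, so it does not depend on which comparison rules the server happens to use.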

Samir Selia answered Jun 05 '26 21:06


