Javascript – How to match multiple occurrences with a regex in JavaScript similar to PHP’s preg_match_all()

javascriptregex

I am trying to parse url-encoded strings that are made up of key=value pairs separated by either & or &.

The following will only match the first occurrence, breaking apart the keys and values into separate result elements:

var result = mystring.match(/(?:&|&)?([^=]+)=([^&]+)/)

The results for the string '1111342=Adam%20Franco&348572=Bob%20Jones' would be:

['1111342', 'Adam%20Franco']

Using the global flag, 'g', will match all occurrences, but only return the fully matched sub-strings, not the separated keys and values:

var result = mystring.match(/(?:&|&)?([^=]+)=([^&]+)/g)

The results for the string '1111342=Adam%20Franco&348572=Bob%20Jones' would be:

['1111342=Adam%20Franco', '&348572=Bob%20Jones']

While I could split the string on & and break apart each key/value pair individually, is there any way using JavaScript's regular expression support to match multiple occurrences of the pattern /(?:&|&)?([^=]+)=([^&]+)/ similar to PHP's preg_match_all() function?

I'm aiming for some way to get results with the sub-matches separated like:

[['1111342', '348572'], ['Adam%20Franco', 'Bob%20Jones']]

or

[['1111342', 'Adam%20Franco'], ['348572', 'Bob%20Jones']]

Best Answer

Hoisted from the comments

2020 comment: rather than using regex, we now have URLSearchParams, which does all of this for us, so no custom code, let alone regex, are necessary anymore.

Mike 'Pomax' Kamermans

Browser support is listed here https://caniuse.com/#feat=urlsearchparams


I would suggest an alternative regex, using sub-groups to capture name and value of the parameters individually and re.exec():

function getUrlParams(url) {
  var re = /(?:\?|&(?:amp;)?)([^=&#]+)(?:=?([^&#]*))/g,
      match, params = {},
      decode = function (s) {return decodeURIComponent(s.replace(/\+/g, " "));};

  if (typeof url == "undefined") url = document.location.href;

  while (match = re.exec(url)) {
    params[decode(match[1])] = decode(match[2]);
  }
  return params;
}

var result = getUrlParams("http://maps.google.de/maps?f=q&source=s_q&hl=de&geocode=&q=Frankfurt+am+Main&sll=50.106047,8.679886&sspn=0.370369,0.833588&ie=UTF8&ll=50.116616,8.680573&spn=0.35972,0.833588&z=11&iwloc=addr");

result is an object:

{
  f: "q"
  geocode: ""
  hl: "de"
  ie: "UTF8"
  iwloc: "addr"
  ll: "50.116616,8.680573"
  q: "Frankfurt am Main"
  sll: "50.106047,8.679886"
  source: "s_q"
  spn: "0.35972,0.833588"
  sspn: "0.370369,0.833588"
  z: "11"
}

The regex breaks down as follows:

(?:            # non-capturing group
  \?|&         #   "?" or "&"
  (?:amp;)?    #   (allow "&", for wrongly HTML-encoded URLs)
)              # end non-capturing group
(              # group 1
  [^=&#]+      #   any character except "=", "&" or "#"; at least once
)              # end group 1 - this will be the parameter's name
(?:            # non-capturing group
  =?           #   an "=", optional
  (            #   group 2
    [^&#]*     #     any character except "&" or "#"; any number of times
  )            #   end group 2 - this will be the parameter's value
)              # end non-capturing group