JMeter – Regular Expression Extractor


The term ‘Correlation’ refers to the handling of dynamic values coming from the server. These are unique values that the server generates, often for security purposes, such as a session ID or an authorization token. In some cases, dynamic values also include web content such as the values in a drop-down list, a calendar date, an item ID, a product ID or an order number. Through correlation, you capture these dynamic values and pass them in the subsequent requests. This is the basic concept of ‘Correlation’ or ‘Handling of Dynamic Values’ in JMeter.

Why do we need to correlate the dynamic values?

To answer this question, you first need to understand what happens during script recording, during script replay, and after the dynamic values are correlated.

Let’s see how the client and the server behave when dynamic values are exchanged between them. This is the scenario while you record a script:

Figure 01: Recording Scenario

When you replay the script without any changes, it fails because the dynamic value (sessionid) generated by the server does not match the value sent back by the client. As Figure 02 shows, during replay the server generated the session ID 222, but the vuser script sent the recorded value, 111, which was captured during recording (Figure 01). Hence the server refused to serve the request and threw an error.

Figure 02: Replay without correlation Scenario

Now, when you correlate the dynamic value (sessionid) and replay the script, the vuser script captures and saves the latest value generated by the server and sends it back to the server in the next request. The server validates the returned value against the generated value and gives the proper response. Hence the request is marked as passed at the user end.

Figure 03
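
The three scenarios can be replayed in a few lines of Python. This is only a toy sketch: `ToyServer`, its methods and the 403 status are hypothetical stand-ins chosen for illustration, and the IDs 111 and 222 follow the figures.

```python
import itertools


class ToyServer:
    """Hypothetical server that issues a new session ID on every login."""

    def __init__(self, first_id=111, step=111):
        self._ids = itertools.count(first_id, step)
        self._active = None

    def login(self):
        # Each login invalidates the old session and issues a fresh ID.
        self._active = str(next(self._ids))
        return f"sessionid={self._active}"

    def request(self, session_id):
        # Only the most recently issued ID is accepted.
        return 200 if session_id == self._active else 403


server = ToyServer()

# Recording: the tool stores whatever the server sent at that time (111).
recorded_id = server.login().split("=")[1]

# Replay without correlation: the server now issues 222,
# but the script still sends the recorded 111.
server.login()
status_without = server.request(recorded_id)

# Replay with correlation: capture the fresh value and send that instead.
fresh_id = server.login().split("=")[1]
status_with = server.request(fresh_id)

print(status_without, status_with)  # 403 200
```

The only difference between the failing and the passing replay is where the session ID comes from: a stored constant in the first case, the latest response in the second.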

Hopefully it is now clear why correlation is so important in JMeter scripting.

Now, the second thing that you should know is:

What are the common values which require correlation?

  1. Session ID
  2. Access Token
  3. Customer Name / ID
  4. Order Number
  5. Bill Number
  6. Number of records displayed on a page
  7. Current Date and Time

There could be more values, depending on the type of the application and the terms used to denote them. Keep in mind that your ultimate goal is to find all the dynamic values that cause the script to fail and to correlate them. Now the next question arises: “HOW?”

In JMeter, regular expressions play an important role: they are used to identify the dynamic values that come in a response. In Micro Focus Performance Center/LoadRunner, dynamic values are captured by Correlation. A regular expression is a pattern that specifies the set of strings to match in order to get the dynamic value. To prepare a regular expression, you need to know how to build it from metacharacters and literal characters. I will not go deep into regular expression syntax (tokens) here, because it would distract us from the original topic. If you want to learn how to write a regular expression, refer to my post where the tokens are explained in a simple way along with examples.
Learn – How to write a regular expression?
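
As a small illustration of how literal characters and metacharacters combine, here is a Python sketch. The response snippet and the `sessionid` field are made up for this example, and Python's `re` module stands in for JMeter's own regex engine; for a pattern this simple the syntax should be the same.

```python
import re

# Hypothetical response fragment containing a dynamic value.
response = '<input type="hidden" name="sessionid" value="a1b2c3" />'

# Literal characters:  name="sessionid" value="  and the closing quote
# Metacharacters:      ( )  capture group
#                      .    any character
#                      +?   one or more, as few as possible (lazy)
pattern = r'name="sessionid" value="(.+?)"'

match = re.search(pattern, response)
session_id = match.group(1) if match else "Not_Found"
print(session_id)  # a1b2c3
```

The literals pin down where the value sits, and the group in parentheses marks the part to keep.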

Now, back to the Regular Expression Extractor element of JMeter. The Regular Expression Extractor is a post-processor (it executes after the response arrives). It is always added under the sampler whose response contains the dynamic value(s) that you need to capture and pass in the next request (where required).

How to add ‘Regular Expression Extractor’?

Follow the steps below:

  1. Select the ‘Sampler’ element whose response contains the dynamic value you want to capture.
  2. Right-click on the element
  3. Hover the mouse on ‘Add’
  4. Hover the mouse on ‘Post Processors’
  5. Click ‘Regular Expression Extractor’

What are the input fields of ‘Regular Expression Extractor’?

‘Regular Expression Extractor’ has the following input fields:

  1. Name: To provide the name of the post-processor
  2. Comments: To provide arbitrary comments (if any)
  3. Apply to: Defines the search scope for the dynamic value.
    1. Main sample and sub-samples: Use this scope when the request is redirected, so that the dynamic content is searched in the responses of both the main and the redirected requests.
    2. Main sample only: Use this scope when the request is not redirected, or when the dynamic value is present only in the response of the main request.
    3. Sub-samples only: Use this scope when the request is redirected and the dynamic value is available in the response of the redirected request.
    4. JMeter Variable Name to use: Select this option when the dynamic value needs to be extracted from the value of a JMeter variable, and provide the variable name in the text field.
  4. Field to check: Refines the search scope and instructs JMeter to search for the dynamic value in a specific part of the sample, sub-sample or JMeter variable, depending on the option you selected in the “Apply to” section.
    1. Body: Instructs JMeter to search in the body of the response. The response headers are not included in the search scope.
    2. Body (unescaped): Searches the body of the response after all HTML escape codes such as &amp;, &quot; and &lt; have been replaced by the characters they stand for. This option impacts JMeter performance, so select it only when really required.
    3. Body as a Document: Searches the text extracted from the document returned by the server (JMeter uses Apache Tika for the extraction).
    4. Response Headers: Searches only the header part of the response. This option does not apply to non-HTTP requests.
    5. Request Headers: Instructs JMeter to search in the header part of the request. It is helpful when the request is redirected and the dynamic value is passed in the header of the sub-request. This option does not apply to non-HTTP requests.
    6. URL: Restricts the search scope to the URL. This option is used when the request is redirected and the dynamic value is available in the URL. The best example is capturing an OAuth access token.
    7. Response Code: Captures the response code. Let’s say you have two transaction flows and the choice between them depends on the pass (response code = 200) or fail (response code != 200) status of the previous request. In such a scenario you can choose “Response Code”, which returns the response code.
    8. Response Message: Captures the response message, for example OK or Gateway Timeout.
  5. Name of created variable: The name of the variable in which the dynamic value will be stored. This is also called a RegEx variable.
  6. Regular Expression: The regular expression used to capture the dynamic value. If you want to learn how to write a regular expression then refer to this post.
  7. Template: The template lets you capture more than one value with a single regular expression. Each template entry refers to a group: $1$ represents group 1, $2$ represents group 2, and so on. $0$ refers to the entire matched string.
  8. Match No. (0 for random): If more than one string in the response matches and you need the value at a particular position (say the 5th), enter 5. JMeter will recognize all the matched values on the page but store only the 5th one in the regular expression variable. This is the same as the ordinal in LoadRunner. ‘-1’ captures all the values, while ‘0’ picks a random value from the list of matched dynamic values.
  9. Default Value: If the regular expression does not match, the regex variable is set to the default value (e.g. Not_Found). This is particularly useful for debugging tests.
  10. Use empty default value: If this checkbox is selected, JMeter sets the regular expression variable to an empty string when there is no match. This is not recommended, because you cannot tell whether the regular expression is working properly.
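
How Template, Match No. and Default Value interact is easier to see in code. The following Python sketch is a simplified model, not JMeter's implementation: `regex_extract` and its parameters are hypothetical names, Python's `re` stands in for JMeter's regex engine, and only the variable naming scheme follows what JMeter produces.

```python
import random
import re


def regex_extract(text, ref_name, regex, template="$1$", match_no=1,
                  default="Not_Found"):
    """Toy model of the extractor: returns the variables it would set."""
    matches = list(re.finditer(regex, text))

    def fill(m):
        # Replace $0$, $1$, ... in the template with the matching group.
        return re.sub(r"\$(\d+)\$",
                      lambda t: m.group(int(t.group(1))) or "", template)

    def groups(prefix, m):
        # prefix_g holds the group count, prefix_g0..gN the group values.
        out = {f"{prefix}_g": str(m.re.groups)}
        for i in range(m.re.groups + 1):
            out[f"{prefix}_g{i}"] = m.group(i) or ""
        return out

    if match_no < 0:  # negative: keep every match as ref_1, ref_2, ...
        result = {ref_name: default,
                  f"{ref_name}_matchNr": str(len(matches))}
        for n, m in enumerate(matches, start=1):
            result[f"{ref_name}_{n}"] = fill(m)
            result.update(groups(f"{ref_name}_{n}", m))
        return result

    if not matches or match_no > len(matches):
        return {ref_name: default}  # no usable match: fall back to default

    m = random.choice(matches) if match_no == 0 else matches[match_no - 1]
    return {ref_name: fill(m), **groups(ref_name, m)}


body = "id=10; id=20; id=30"
second = regex_extract(body, "item", r"id=(\d+)", match_no=2)["item"]
third = regex_extract(body, "item", r"id=(\d+)", match_no=-1)["item_3"]
missing = regex_extract(body, "item", r"order=(\d+)")["item"]
print(second, third, missing)  # 20 30 Not_Found
```

A recognizable default such as Not_Found makes a failed extraction visible in the next request, which is the debugging benefit mentioned for the Default Value field.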

Learn with an example:

Let’s say I have identified two dynamic parameters (code and execution) in the response of a page, and each of them occurs twice on the same page:

<div>

<form id="kc-form-login" class="dialog-form" action="https://perfmatrix-public-gateway.com/auth/realms/be55d902-3d75-49b1-b703-322196853ef0/login-actions/authenticate?code=dcsncj-93c51-455d-f-4dadxsdc5-c5c54-nbnd-155xsxcssx127b90fdf-24d5-4986-a7f0-5be4a3e9f5b8&execution=343984ya-689d-4b98-8ff5-98561dfre851" method="post">

<form id="kc-form-auth" class="dialog-form" action="https://perfmatrix-public-gateway.com/auth/realms/be55d902-3d75-49b1-b703-322196853ef0/login-actions/authenticate?code=meet9xwRFX4noYrnYFuVRw1xDHOA0_xQOkG5GUyeZzo.d7b90fdf-24d5-4986-a7f0-5be4a3e9f5b8&execution=371997f4-6c0d-4b98-88ec-d74b27af9e86" method="post">

</div>

To extract these values, I will apply the following regular expression:

Figure 04

The output will be:
secureID_1=dcsncj-93c51-455d-f-4dadxsdc5-c5c54-nbnd-155xsxcssx127b90fdf-24d5-4986-a7f0-5be4a3e9f5b8343984ya-689d-4b98-8ff5-98561dfre851
secureID_1_g=2
secureID_1_g0=code=dcsncj-93c51-455d-f-4dadxsdc5-c5c54-nbnd-155xsxcssx127b90fdf-24d5-4986-a7f0-5be4a3e9f5b8&execution=343984ya-689d-4b98-8ff5-98561dfre851"
secureID_1_g1=dcsncj-93c51-455d-f-4dadxsdc5-c5c54-nbnd-155xsxcssx127b90fdf-24d5-4986-a7f0-5be4a3e9f5b8
secureID_1_g2=343984ya-689d-4b98-8ff5-98561dfre851
secureID_2=47YezIPEqm_yGCDjCPdAnJhWGtQqZsEHef53NKz5L2Q.0b180542-f3f2-4ba2-aff9-cbcea3ecfd2c371997f4-6c0d-4b98-88ec-d74b27af9e86
secureID_2_g=2
secureID_2_g0=code=47YezIPEqm_yGCDjCPdAnJhWGtQqZsEHef53NKz5L2Q.0b180542-f3f2-4ba2-aff9-cbcea3ecfd2c&execution=371997f4-6c0d-4b98-88ec-d74b27af9e86"
secureID_2_g1=47YezIPEqm_yGCDjCPdAnJhWGtQqZsEHef53NKz5L2Q.0b180542-f3f2-4ba2-aff9-cbcea3ecfd2c
secureID_2_g2=371997f4-6c0d-4b98-88ec-d74b27af9e86

The first variable, “secureID_1”, holds all the dynamic values captured at the first occurrence. You cannot tell the extracted values apart in this variable because there is no separator between them.

The second variable, “secureID_1_g”, gives the number of groups. In our example, there are 2 groups.

The third variable, “secureID_1_g0”, holds the full matched string from which the desired values are extracted.

The fourth variable, “secureID_1_g1”, holds the extracted value of the code field at its first occurrence in the response.

The fifth variable, “secureID_1_g2”, holds the extracted value of the execution field at its first occurrence in the response.
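
The five variables for the first occurrence can be reproduced with a short Python sketch. The regular expression here is an assumption inferred from the listed output (two groups, and a trailing double quote in `_g0`), since the original is only visible in Figure 04; the concatenation mimics a `$1$$2$` template, which is likewise inferred. Python's `re` stands in for JMeter's regex engine.

```python
import re

# First <form> tag from the sample response (straight quotes).
html = (
    '<form id="kc-form-login" class="dialog-form" '
    'action="https://perfmatrix-public-gateway.com/auth/realms/'
    'be55d902-3d75-49b1-b703-322196853ef0/login-actions/authenticate'
    '?code=dcsncj-93c51-455d-f-4dadxsdc5-c5c54-nbnd-155xsxcssx127b90fdf-'
    '24d5-4986-a7f0-5be4a3e9f5b8'
    '&execution=343984ya-689d-4b98-8ff5-98561dfre851" method="post">'
)

# Assumed pattern: group 1 = code, group 2 = execution.
regex = r'code=(.+?)&execution=(.+?)"'
m = re.search(regex, html)

variables = {
    "secureID_1": m.group(1) + m.group(2),  # assumed template $1$$2$
    "secureID_1_g": str(m.re.groups),       # number of groups
    "secureID_1_g0": m.group(0),            # whole match
    "secureID_1_g1": m.group(1),            # code
    "secureID_1_g2": m.group(2),            # execution
}
for name, value in variables.items():
    print(f"{name}={value}")
```

Because the template joins the two groups without a separator, `secureID_1` is hard to read, while `secureID_1_g1` and `secureID_1_g2` give the values individually.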

Now, if I want to use the 2nd-occurrence values of the code and execution parameters, I simply need to replace the original values like this:

Figure 05
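
In JMeter a stored variable is referenced as `${variableName}`, so the second-occurrence values would be written as `${secureID_2_g1}` and `${secureID_2_g2}` in the next request. The Python sketch below only imitates that substitution to show the effect; `substitute` is a hypothetical helper, and the request path is shortened for illustration.

```python
import re


def substitute(template, variables):
    """Replace ${name} references; undefined names are left untouched."""
    return re.sub(r"\$\{(\w+)\}",
                  lambda m: variables.get(m.group(1), m.group(0)),
                  template)


# Values taken from the extractor output for the 2nd occurrence.
variables = {
    "secureID_2_g1": "47YezIPEqm_yGCDjCPdAnJhWGtQqZsEHef53NKz5L2Q."
                     "0b180542-f3f2-4ba2-aff9-cbcea3ecfd2c",
    "secureID_2_g2": "371997f4-6c0d-4b98-88ec-d74b27af9e86",
}

# Shortened, illustrative path of the next request.
path = ("/login-actions/authenticate"
        "?code=${secureID_2_g1}&execution=${secureID_2_g2}")

resolved = substitute(path, variables)
print(resolved)
```

Leaving an undefined reference untouched makes a wrong variable name easy to spot, because the literal `${...}` text shows up in the request that is sent.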

This is a straightforward and easy way to capture a dynamic value and pass it in the next requests in the script. By the way, the Regular Expression Extractor is the soul of Apache JMeter and is frequently asked about in performance testing interviews.

The regular expression can be tested using RegEx Tester.

