On the web, clients and servers perform every interaction by passing HTTP Messages between them. To download a resource from a server, a client initiates a client/server transaction by sending the server an HTTP Request Message. The header of an HTTP Request Message contains details associated with the request. In response, the server packages the requested content and sends it to the client as an HTTP Response Message. This round-trip exchange constitutes one Request/Response transaction. The entire conversation usually lasts several minutes and is referred to as a Session.
What is Duplicate Content?
Simply put, duplicate content means requesting and receiving the same set of data from your back-end systems in a response message. Each time a client requests content from a server, the server returns it as an HTTP Response Message.
Imagine saving each Response as it arrives from the server. By the end of the session, you would have a collection of Response Entities. If you checked this collection for duplicates, that is, for Entities that contain exactly the same content, you would probably find some, perhaps many.
Since every content request results in a Full Response from the server, eliminating the Full Responses that produce duplicate content also eliminates the overhead associated with each Full Response, which includes:
- The time it takes for the Request Message to reach the server, for the server to fill the request, and for the Response Message to return to the client.
- The wireless bandwidth consumed by the transfer.
- The battery drain from using radio resources.
Eliminating duplicate content also provides benefits to the individual user, such as:
- Improved application responsiveness.
- Longer-lasting battery charge.
- Reduced impact on the user's data plan.
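The duplicate check described above is easy to picture in code: hash each saved Response body and look for hashes that repeat. The following Python sketch is purely illustrative (the data and function name are mine, not part of any HTTP library):

```python
import hashlib

def find_duplicates(responses):
    """Group response bodies by content hash and report duplicates.

    `responses` is a list of (url, body_bytes) pairs, an illustrative
    stand-in for the Response Entities saved during a session.
    """
    seen = {}  # content hash -> list of URLs that returned that content
    for url, body in responses:
        digest = hashlib.sha256(body).hexdigest()
        seen.setdefault(digest, []).append(url)
    # Keep only hashes that appeared more than once: duplicate content.
    return {h: urls for h, urls in seen.items() if len(urls) > 1}

# A tiny made-up session: the logo bytes are downloaded twice.
session = [
    ("/logo.png", b"\x89PNG-logo-bytes"),
    ("/home", b"<html>home</html>"),
    ("/logo.png", b"\x89PNG-logo-bytes"),  # same bytes downloaded again
]
dupes = find_duplicates(session)
```

Every entry in `dupes` represents a Full Response whose round trip, bandwidth, and battery cost could have been avoided.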
What is the solution for Duplicate Content?
This problem can be solved by incorporating a caching strategy into your client application design. The cache is a process that operates as a middle man between the client and server processes. The cache serves locally stored copies of Responses, so you can refer to it as a Response Cache.

There are several ways to implement caching. Libraries are available for this, and some operating systems provide functions for it, but the most direct approach is to incorporate Response Entity Cache functionality into your client software. Caching is important to your application for the following reasons:
- Cached files are available immediately, with no download latency. This makes your application appear faster.
- Battery life. Every data connection drains the device's battery. If the battery drain appears too excessive, the user may uninstall the application.
- Data caps. Users have monthly data caps, and resending files over and over can result in additional costs to your customers if they exceed their monthly data allowance and incur overage charges.
- Citizenship: Wireless networks have limited capacity. If you clog the network with extra data, it hurts the responsiveness of all applications (as well as phone calls).
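To make the Response Cache idea concrete, here is a minimal sketch of one in Python. It is an assumption-laden toy, not a production design: entries are keyed by URL, kept fresh for a caller-supplied `max_age`, and a network fetch happens only on a miss or after expiry.

```python
import time

class ResponseCache:
    """Minimal Response Entity Cache sketch: serve a locally stored copy
    while it is fresh, otherwise fall through to the network fetch."""

    def __init__(self):
        self._store = {}  # url -> (body, expires_at)

    def get(self, url, fetch, max_age=60):
        entry = self._store.get(url)
        if entry is not None and time.monotonic() < entry[1]:
            return entry[0]          # cache hit: no network round trip
        body = fetch(url)            # cache miss: go to the server
        self._store[url] = (body, time.monotonic() + max_age)
        return body

# Illustrative usage: count how often the "network" is actually used.
cache = ResponseCache()
network_calls = []

def fetch_from_server(url):
    network_calls.append(url)        # stand-in for a real HTTP request
    return b"<html>payload</html>"

first = cache.get("/home", fetch_from_server)
second = cache.get("/home", fetch_from_server)  # served from the cache
```

Two requests for `/home`, but only one trip to the server: the second Response costs no latency, bandwidth, or battery.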
How to test for Duplicate Content?
Testing for Duplicate Content is simple and does not require much technical expertise. Several tools are available that can help you find Duplicate Content and understand its implications. I will show you how to test for Duplicate Content using AT&T ARO.
I used the following test scenario to find out how effectively my AUT (application under test) uses its caching strategy. Specifically, I check whether my AUT is programmed to cache or retain frequently used downloadable content such as images, audio, and XML. An application should not download more than 3 duplicate items. The Caching: Duplicate Content test in the ARO Data Analyzer helps you find the following:
- Caching is present but is not being utilized correctly.
- ETags in HTTP headers not being utilized correctly.
- Cache headers not established. See Cache Control Test ID 1.2.
- HTTP expiration model not being utilized. See Content Expiration Test ID 1.3.
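The ETag finding above comes down to conditional requests: a client that stored the server's ETag can revalidate with If-None-Match and accept a 304 Not Modified instead of re-downloading the body. The header names below are standard HTTP; the helper functions themselves are my own illustrative sketch:

```python
def revalidation_headers(cached_headers):
    """Build conditional request headers from a cached response's headers."""
    cond = {}
    etag = cached_headers.get("ETag")
    if etag:
        cond["If-None-Match"] = etag
    last_modified = cached_headers.get("Last-Modified")
    if last_modified:
        cond["If-Modified-Since"] = last_modified
    return cond

def body_after_revalidation(status, cached_body, new_body):
    """A 304 Not Modified means reuse the cached entity; anything else
    means the server sent a fresh body that replaces the cached one."""
    return cached_body if status == 304 else new_body
```

An app that never sends If-None-Match forfeits every 304 the server could have answered with, which is exactly the kind of waste this ARO test surfaces.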
To run the test, capture a trace as follows:
- From the ARO Data Collector, open the task killer and kill all tasks.
- Hide the Collector and launch the app under test.
- Utilize the app under test for 20 minutes. Navigate through the application at a pace that emulates a real user. Cover all the states that apply to the app, including:
- Moving back and forth between multiple screens within the application
- GPS State
- Bluetooth State
- Screen Rotation
- Intentionally navigate back through the application to a screen that has been visited previously in the trace.
- Stop the trace in the ARO Data Collector.
- Transfer the trace to the ARO Data Analyzer.
Pass / Failure Thresholds:

| Pass | Not the most efficient result possible, but unlikely to cause significant user dissatisfaction | Definitely inefficient, likely to cause an effect that will be noticed by the user in terms of network delay, battery usage, or data consumption | Extremely inefficient, likely to have unacceptable user impact in terms of network delay, battery usage, or data consumption |
|---|---|---|---|
| ARO Green icon | The lesser of 5% of total content or 0.5 MB of data | Between 5% and 20% of total content, or between 0.5 MB and 2.0 MB of data | More than 20% of total content or more than 2.0 MB of data |
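The thresholds above can be applied mechanically to a trace. Below is one possible reading of the table as Python; the cut-off values come straight from it, but the function name and band labels are my own:

```python
def classify_duplicate_content(duplicate_bytes, total_bytes):
    """Map a trace's duplicate-content volume onto the threshold bands
    from the Pass / Failure table (illustrative interpretation)."""
    mb = 1024 * 1024
    # Band 1: at or below the lesser of 5% of total content or 0.5 MB.
    if duplicate_bytes <= min(0.05 * total_bytes, 0.5 * mb):
        return "pass"
    # Band 2: at or below the lesser of 20% of total content or 2.0 MB.
    if duplicate_bytes <= min(0.20 * total_bytes, 2.0 * mb):
        return "definitely inefficient"
    # Band 3: more than 20% of total content or more than 2.0 MB.
    return "extremely inefficient"

# Example: a 10 MB trace with 100 KB of duplicate content passes.
verdict = classify_duplicate_content(100_000, 10_000_000)
```

Feeding the byte counts from an ARO trace through a check like this makes the pass/fail call repeatable instead of eyeballed.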